One strategy for reducing the online computational cost of matched-filter searches for gravitational
waves is to introduce a compressed basis for the waveform template bank in a grid-based search. In 
this paper, we propose and investigate several tunable compression schemes for a general template 
bank. Through offline compression, such schemes are shown to yield faster detection and localisation 
of signals, along with moderately improved sensitivity and accuracy over coarsened banks at the 
same level of computational cost. This is potentially useful for any search involving template banks, 
and especially in the analysis of data from future space-based detectors such as eLISA, for which 
online grid searches are difficult due to the long-duration waveforms and large parameter spaces. 
I. INTRODUCTION

Advanced LIGO [1] has recently made the first direct detection of gravitational
waves (GWs) from an astrophysical source; more detections are now expected
routinely from the ground-based interferometer network comprising Advanced LIGO
and Advanced Virgo. These should be followed over the next two decades by
detections of nanohertz GW sources using pulsar timing arrays, and of millihertz
sources by the proposed space-based detector eLISA or more ambitious missions
such as DECIGO. The search for GW signals in noisy data from such detectors, and
the follow-up estimation of their source parameters, is contingent upon reliable
statistical analysis of the data.

GW signals from sources such as stellar-mass compact binary coalescences or
massive black-hole binary inspirals are typically weak compared to the detector
noise in which they are embedded. The standard approach in GW data analysis is
to correlate the detector data with a bank of waveform templates sampled from
the parameter space of a waveform model, which allows signal-to-noise ratio
(SNR) to be built up over the detector bandwidth. This correlation is
essentially an inner product on the function space of finite-length time series;
it must be evaluated numerically for each template, and carries the bulk of the
computational cost in online GW searches [3,13].

Various strategies exist to reduce the online cost of evaluating inner products
for GW detection and parameter estimation purposes, typically by shifting the
computational burden to the preparatory offline stage. Some methods focus on
making individual inner products computationally cheaper: this may be achieved
across regions of parameter space through direct interpolation [9], [10], or
more generally by using a reduced order quadrature. Other methods seek to reduce
the number of required inner products, either by accelerating the convergence to
correlation maxima in a stochastic search or through reduced-basis decomposition
of the template bank in a grid search [16]-[18].

In a recently proposed method for evaluating fewer inner products in a grid
search, binary labelling is used to define a compressed non-orthogonal basis
that maximises compression losslessly (in the sense of perfect signal recovery
without noise) [19]. This idea is fully general and admits a much higher
compression rate than existing methods based on the eigenvalue structure of the
template bank, but comes with significant penalties to detection sensitivity and
identification accuracy in the presence of detector noise. The method as
originally described also suffers from an arbitrarily asymmetric treatment of
templates, as well as a restrictive level of compression that limits its
practicality to high-SNR signals. While the binary labelling method might be
useful in the context of eLISA (where source SNRs are potentially higher than
for ground-based detectors), its practical applicability to GW data analysis
remains undeveloped and hence unclear.

In this paper, we introduce and develop the related method of conic compression
(i.e. defining a compressed basis through conic combinations of templates) by
characterising its performance under various simplifying assumptions, before
investigating its viability for current and future GW detectors with a more
realistic example. We propose several compression schemes, one of which subsumes
a symmetric-treatment version of the binary labelling method [19] as a
particular case. These tunable schemes feature discrete transitions between zero
and maximal compression, and offer fast detection and localisation of GW signals
in the search space with a controlled loss (if any) in sensitivity and accuracy.
Their generality and straightforward implementation also allow them to
supplement existing grid-search methods, or to rapidly identify seed points for
stochastic searches.

The general method of conic compression is set out in Sec. II. Three families of
conic compression schemes are then proposed in Secs. II A-II C: a lossy scheme
based on partitions of the template bank, and two lossless schemes whose conic
combinations are determined permutatively or by base representations of template
labels. We calculate the optimal detection statistics for these schemes, and
find that the standard maximum-overlap statistic is significantly suboptimal for
detection in the lossless case. Sec. II D compares the three schemes under
simplified conditions, i.e. assuming the GW signal is proportional to a single
template in an orthogonal template bank. The lossy partition scheme is shown to
have slightly better detection sensitivity than its lossless counterparts at the
same level of compression. Furthermore, while the lossless schemes provide
automatic identification (i.e. localisation to a single template) of the signal
upon detection, the identification accuracy falls off more rapidly with
compression in the presence of noise.

We focus exclusively on the partition scheme in Sec. III, where the
orthogonality and single-template assumptions are lifted separately. As shown in
Sec. III A, the overall performance of the scheme is partition-dependent in the
case of a correlated (non-orthogonal) template bank, and must be pre-optimised
by grouping highly correlated templates together. The optimised partition scheme
retains the benefits of a correlated template bank up to high levels of
compression, and is superior to a simple "coarsening" of the template bank
(obtained by increasing the maximal mismatch between neighbouring templates).
Sec. III B discusses the case of a GW signal lying in a low-dimensional subspace
of an orthogonal template bank, for which the detection sensitivity of the
scheme is not significantly reduced.

In Sec. IV, we implement the optimised partition scheme for a highly correlated
(maximal mismatch ≈ 0.01) template bank of ~ 10'’’ post-Newtonian (PN)
waveforms, which describe the gravitational radiation emitted during the
inspiral phase of a comparable-mass binary merger. The scheme is shown to be
viable for practical applications, as it performs well on this example up to
high levels of compression and at all considered values of SNR. Its detection
rate for a signal injected centrally is superior to that of the coarsening
approach (especially at compression rates of over 80%), and this improvement is
even more marked for a signal injected at the boundary of the bank. In addition,
the accuracy rate for localisation of the injection to a < 0.1% region of the
search space is undiminished up to a compression level of 90%, and is again
higher than that of the coarsening approach.

The considerable speed-up and enhanced accuracy in localising the GW signal with
conic compression is particularly promising for eLISA data analysis, where the
online use of template banks is made challenging by the large parameter spaces
of typical sources. While the long duration of eLISA signals is computationally
prohibitive to fully coherent searches even with compression, our method is
suitable for the shorter semi-coherent searches that are required for rapid
electromagnetic follow-up.

Conic compression might also provide a viable alternative to the
singular-value-decomposition (SVD) method used in LIGO detection pipelines for
compact binary coalescences: it scales well with parameter-space dimensionality
and easily matches or surpasses the order-of-magnitude computational savings of
the SVD method, with any loss of SNR coming mainly from the maximal mismatch of
the original template bank (rather than an SVD reconstruction). Furthermore, our
method may in principle be used to further compress the reduced bases obtained
through the various orthogonal-decomposition methods [16]-[18]. Whether any
computational benefits might be gained from such a combination of the two
approaches is left for future investigation.


II. COMPRESSION SCHEMES 


In the standard GW data analysis framework, data from a detector may be written
as the time series

$$\mathcal{D}(t) = S(t) + \mathcal{N}(t), \tag{1}$$

where the GW signal $S(t)$ is a deterministic function of time (and some unknown
source parameters), and the additive detector noise $\mathcal{N}(t)$ is a
Gaussian and stationary stochastic process.

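Matched filtering involves passing the data through some GW template filter
$\mathcal{F}(t)$ via convolution, which defines an inner product on the function
space of finite-length time series [21]. This inner product is given by

$$(\mathcal{D}|\mathcal{F}) = \int_{-\infty}^{\infty} \frac{\tilde{\mathcal{D}}^*(f)\,\tilde{\mathcal{F}}(f)}{S_{\mathcal{N}}(f)}\,df, \tag{2}$$

where tildes denote Fourier transforms and $S_{\mathcal{N}}(f)$ is the two-sided
spectral density of the detector noise. Since $\mathcal{N}(t)$ is stationary,
$S_{\mathcal{N}}(f)$ is simply the Fourier transform of the autocorrelation
function $E(\mathcal{N}(t)\mathcal{N}(t-\tau))$, and we have the identity

$$E\big((\mathcal{N}|\mathcal{F})(\mathcal{N}|\mathcal{F}')\big) = (\mathcal{F}|\mathcal{F}'). \tag{3}$$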


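The SNR $\rho_{\mathcal{F}}$ of the filtered data is then related to the true SNR
$\rho$ by

$$\rho_{\mathcal{F}} := \frac{(S|\mathcal{F})}{\sqrt{(\mathcal{F}|\mathcal{F})}} \le \sqrt{(S|S)} =: \rho. \tag{4}$$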


We now consider a generic bank of $N$ GW templates $h_n(t)$, where the template
labels $n$ are drawn from the collection
$\mathbb{N} := \{n \in \mathbb{Z}^+ \mid n \le N\}$, and the templates have been
normalised such that $(h_n|h_n) = 1$ for all $n \in \mathbb{N}$. The inner
products of the data $\mathcal{D}$ and the templates define $N$ associated
statistics

$$x_n := (\mathcal{D}|h_n), \tag{5}$$

which may be used for detection and localisation in a simple grid search.
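
As a minimal illustration of the statistics (5), the sketch below assumes white noise of unit variance, for which the inner product reduces to a plain dot product of sampled time series. The sinusoidal toy bank and the function names (`inner`, `make_bank`, `statistics`) are assumptions made purely for illustration and are not taken from this paper; labels are 0-based in the code.

```python
import math
import random

def inner(a, b):
    """Discrete stand-in for the inner product (2), assuming unit-variance white noise."""
    return sum(u * v for u, v in zip(a, b))

def normalise(h):
    """Rescale a template so that (h|h) = 1."""
    norm = math.sqrt(inner(h, h))
    return [u / norm for u in h]

def make_bank(n_templates, n_samples):
    """Toy orthonormal bank: sinusoids at distinct Fourier frequencies (illustrative only)."""
    return [normalise([math.sin(2 * math.pi * (n + 1) * t / n_samples)
                       for t in range(n_samples)])
            for n in range(n_templates)]

def statistics(data, bank):
    """One inner-product evaluation per template, as in a simple grid search."""
    return [inner(data, h) for h in bank]

rng = random.Random(0)
bank = make_bank(8, 64)
A = 10.0                                    # signal amplitude (equal to the true SNR here)
signal = [A * u for u in bank[0]]           # signal proportional to a single template
data = [s + rng.gauss(0.0, 1.0) for s in signal]
x = statistics(data, bank)                  # N = 8 noisy statistics
```

In the absence of noise, the statistic of the matching template recovers the amplitude `A` and all others vanish, which is the behaviour assumed in the orthogonal single-template analysis of this section.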

Our general method of compression is to reduce the number of statistic
evaluations from $N$ to $M$ by considering conic (i.e. positive-coefficient)
combinations of the original templates. The template labels are grouped into $M$
sets $\mathbb{U}_m$, where the set labels $m$ are drawn from the collection
$\mathbb{M} := \{m \in \mathbb{Z}^+ \mid m \le M\}$, and the sets satisfy
$\bigcup_{m\in\mathbb{M}} \mathbb{U}_m = \mathbb{N}$. These sets define $M$
conic templates

$$H_m(t) := \sum_{n\in\mathbb{U}_m} h_n(t), \tag{6}$$

which are prepared at the offline stage (like the template bank itself), along
with $M$ associated statistics

$$X_m := (\mathcal{D}|H_m) = \sum_{n\in\mathbb{U}_m} x_n, \tag{7}$$

which are evaluated at the online stage.
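
The linearity behind the second equality in (7) can be checked with a short sketch. It again assumes unit-variance white noise (so that the inner product is a dot product), and uses a toy bank of standard basis vectors; the helper name `conic_templates` and the example data are hypothetical.

```python
def inner(a, b):
    """Dot product, standing in for the inner product under unit white noise."""
    return sum(u * v for u, v in zip(a, b))

def conic_templates(bank, sets):
    """Build each H_m as the equal-weight sum of the templates h_n with n in U_m (offline)."""
    n_samples = len(bank[0])
    return [[sum(bank[n][t] for n in U) for t in range(n_samples)] for U in sets]

# Toy orthonormal bank of N = 6 templates (standard basis vectors).
N = 6
bank = [[1.0 if t == n else 0.0 for t in range(N)] for n in range(N)]
sets = [[0, 1, 2], [3, 4, 5]]               # M = 2 sets of 0-based labels
data = [0.3, -1.2, 4.0, 0.7, 0.1, -0.5]     # arbitrary illustrative data

H = conic_templates(bank, sets)
X = [inner(data, H_m) for H_m in H]         # M online evaluations
x = [inner(data, h) for h in bank]          # N evaluations, shown only for comparison
X_from_x = [sum(x[n] for n in U) for U in sets]
```

Only the `M` evaluations in `X` are needed online; `X_from_x` is computed here solely to confirm that both routes agree.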

Without any prior assumptions on the template bank, 
each template must be treated equally. This is done by 
ensuring that: 

(a) each combination is weighted equally; 

(b) each combination includes the same number of templates;

(c) each template is included in the same number of 
combinations. 

Definition (6) has been chosen to satisfy condition (a), while condition (b) is
imposed by further requiring
$\mathrm{card}(\mathbb{U}_m) = \mathrm{card}(\mathbb{U}_{m'})$ for all
$m, m' \in \mathbb{M}$ (where the set cardinality $\mathrm{card}(\mathbb{S})$ is
the number of elements in the set $\mathbb{S}$). Condition (c) must be enforced
separately in the construction of the sets. The second equality in
Definition (7) relates the conic statistic evaluations to the original
statistics (5), which are no longer evaluated at the online stage.

To simplify analysis, we first assume the template bank is an orthogonal set
such that

$$(h_n|h_{n'}) = \delta_{nn'}, \tag{8}$$

where $\delta_{ij}$ is the Kronecker delta. We further assume the GW signal (if
present) lies in the one-dimensional subspace spanned by a single template in
Hilbert space, i.e.

$$S(t) = A\,h_1(t), \tag{9}$$

where $A > 0$ and the templates have been relabelled without loss of generality.
It follows from (8) and (9) that $A = \rho$. These orthogonal and 1-D
restrictions are neither realistic nor optimal, but facilitate the analytic
assessment and comparison of various compression schemes in this section. The
overall performance of conic compression is generally improved by the lifting of
these assumptions, which we consider in Secs. III and IV.

In the presence of a GW signal, the expectation values and covariances of the
normally distributed original statistics (5) are now given by

$$E(x_n) = A\,(h_1|h_n) = A\,\delta_{1n}, \tag{10}$$

$$\mathrm{cov}(x_n, x_{n'}) = (h_n|h_{n'}) = \delta_{nn'}. \tag{11}$$

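As the labelling of templates is itself a probabilistic process with discrete
uniform distribution, the original statistic vector $x$ has the multivariate
Gaussian distribution $\mathcal{G}(\mu^{(i)}, \Sigma)$ (with
$\mu^{(i)}_n = E(x_n)$ and $\Sigma_{nn'} = \mathrm{cov}(x_n, x_{n'})$), but
summed over the $N$ possible assignments $i$ of $1 \in \mathbb{N}$ and
renormalised accordingly. If the signal is absent, the distribution of $x$ is
simply $\mathcal{G}(0, \Sigma)$. Hence we have

$$p_1(x) \propto \frac{1}{N}\sum_{i=1}^{N} \exp\left(-\frac{1}{2}\,(x - \mu^{(i)})^T \Sigma^{-1} (x - \mu^{(i)})\right), \tag{12}$$

$$p_0(x) \propto \exp\left(-\frac{1}{2}\,x^T \Sigma^{-1} x\right), \tag{13}$$

where $p_1$ and $p_0$ are the probability density functions of $x$ in the
respective presence or absence of a GW signal.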

An optimal detection region $\mathcal{R}$ in Hilbert space maximises the
detection rate $P_D = \int_{\mathcal{R}} p_1$ subject to a given false alarm
rate $P_F = \int_{\mathcal{R}} p_0$; hence $p_1 = \lambda p_0$ on its boundary
$\partial\mathcal{R}$ for some Lagrange multiplier $\lambda$. Using (12) and
(13) together with (10) and (11), we define the optimal detection statistic

$$\lambda_{\rm opt} := \frac{p_1(x)}{p_0(x)} = \frac{1}{N}\exp\left(-\frac{A^2}{2}\right)\sum_{n\in\mathbb{N}} \exp(A x_n), \tag{14}$$

such that the optimal detection surfaces $\partial\mathcal{R}$ are precisely the
level sets of $\lambda_{\rm opt}$ parametrised by $\lambda$, and a detection is
claimed if $\lambda_{\rm opt}$ exceeds the threshold $\lambda_t$ corresponding
to some fixed value of $P_F$.

FIG. 1: Three-dimensional projection of optimal detection surface for
uncorrelated statistics $x_n$, at true SNR of 2.

In deriving (14), we have implicitly assumed a population of GW sources with
equal likelihood and known signal amplitude. Eq. (14) therefore defines the
optimal statistic for detecting events drawn from such a population. For a
population of sources that are not equally likely, we need to replace the sum in
(14) with a suitably weighted sum. Similarly, for a population with a
distribution of amplitudes, we need to marginalise (14) over $A$; in the case of
an (improper) uniform prior over the interval $(-\infty, \infty)$, this would
give a detection statistic proportional to $\sum_n \exp(x_n^2/2)$.
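
For an orthonormal bank ($\Sigma$ equal to the identity) and a known amplitude $A$, the ratio of (12) to (13) reduces to $(1/N)\sum_n \exp(A x_n - A^2/2)$. The sketch below evaluates the logarithm of this quantity, and of the amplitude-marginalised sum $\sum_n \exp(x_n^2/2)$, using a log-sum-exp to avoid overflow at high SNR. The function names are illustrative assumptions.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def log_lambda_opt(x, amplitude):
    """log of (1/N) * sum_n exp(A*x_n - A^2/2), for uncorrelated unit-variance x_n."""
    terms = [amplitude * x_n - 0.5 * amplitude ** 2 for x_n in x]
    return log_sum_exp(terms) - math.log(len(x))

def log_lambda_marginal(x):
    """log of sum_n exp(x_n^2 / 2): flat amplitude prior, up to an additive constant."""
    return log_sum_exp([0.5 * x_n ** 2 for x_n in x])

def x_max(x):
    """Standard grid-search (maximum-overlap) statistic."""
    return max(x)
```

Any monotonic function of the statistic defines the same detection regions, so thresholding the logarithm is equivalent to thresholding the statistic itself.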

Any choice of population makes assumptions about the astrophysical distribution
of GW sources that might not be justified. In this paper, the focus is on the
investigation and comparison of template bank compression schemes, and so we
only consider the equal-likelihood and known-amplitude population assumed in the
derivation of (14). While the treatment of amplitude in particular is
artificial, a search that is optimised for sensitivity to signal amplitudes
around the detection threshold will likely be near-optimal for any given
astrophysical population (and closer to optimality than a search tuned for the
wrong astrophysical population). Finally, we note that although (14) has been
derived as a frequentist optimal statistic, the same equation also arises as the
Bayes factor for the presence (versus absence) of a signal, assuming flat model
priors and the source population assumptions outlined above.


For sufficiently high SNR (large $A$), the optimal surfaces
$\lambda_{\rm opt} = \lambda$ defined by (14) are well approximated by
semi-infinite hypercubes in Hilbert space (see Fig. 1), i.e. the level sets of
the standard grid-search detection statistic

$$x_{\max} := \max_{n\in\mathbb{N}}\{x_n\}. \tag{15}$$

Since the original statistics are uncorrelated, the probability density
functions of $x_{\max}$ in the presence or absence of a GW signal are obtainable
explicitly. These are given respectively by

$$q_1(x_{\max}) = F_0(x_{\max})^{N-1} f_1(x_{\max}) + (N-1)\,F_0(x_{\max})^{N-2} F_1(x_{\max}) f_0(x_{\max}), \tag{16}$$

$$q_0(x_{\max}) = N F_0(x_{\max})^{N-1} f_0(x_{\max}), \tag{17}$$

where $f_s(x_{\max})$ is the probability density function for the Gaussian
distribution $\mathcal{G}(sA, 1)$, and $F_s(x_{\max})$ is the cumulative
distribution function

$$F_s(x_{\max}) := \int_{-\infty}^{x_{\max}} f_s(u)\,du. \tag{18}$$
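
Integrating (16) and (17) above a threshold $t$ gives the closed forms $P_D = 1 - F_1(t)F_0(t)^{N-1}$ and $P_F = 1 - F_0(t)^N$, from which an ROC curve for the maximum-overlap statistic can be traced directly. The following sketch does this with the standard-library error function; the function names and the example values of $N$, $A$ and the threshold grid are illustrative assumptions.

```python
import math

def norm_cdf(u, mean=0.0):
    """Cumulative distribution function of a unit-variance Gaussian with the given mean."""
    return 0.5 * (1.0 + math.erf((u - mean) / math.sqrt(2.0)))

def false_alarm_rate(threshold, n_templates):
    """P_F = 1 - F_0(t)^N, i.e. the integral of q_0 above the threshold."""
    return 1.0 - norm_cdf(threshold) ** n_templates

def detection_rate(threshold, n_templates, amplitude):
    """P_D = 1 - F_1(t) * F_0(t)^(N-1), i.e. the integral of q_1 above the threshold."""
    return 1.0 - norm_cdf(threshold, amplitude) * norm_cdf(threshold) ** (n_templates - 1)

def roc_curve(n_templates, amplitude, thresholds):
    """List of (P_F, P_D) pairs, one per threshold."""
    return [(false_alarm_rate(t, n_templates), detection_rate(t, n_templates, amplitude))
            for t in thresholds]

roc = roc_curve(256, 5.0, [0.5 * k for k in range(17)])   # thresholds from 0 to 8
```

Setting the amplitude to zero makes the two rates coincide, which corresponds to the diagonal of a random search.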

For our analysis of conic compression schemes, we also require the expectation
values and covariances of the normally distributed conic statistics (7). From
(7), (10) and (11), it follows in the presence of a GW signal that

$$E(X_m) = \sum_{n\in\mathbb{U}_m} E(x_n) = A\,\mathrm{card}(\{1\}\cap\mathbb{U}_m), \tag{19}$$

$$\mathrm{cov}(X_m, X_{m'}) = \sum_{n\in\mathbb{U}_m}\sum_{n'\in\mathbb{U}_{m'}} \mathrm{cov}(x_n, x_{n'}) = \mathrm{card}(\mathbb{U}_m\cap\mathbb{U}_{m'}), \tag{20}$$

where the cardinalities are determined by the choice of compression scheme. As
before, the conic statistic vector $X$ has the multivariate Gaussian
distribution $\mathcal{G}(\mu^{(i)}, \Sigma)$ (now with $\mu^{(i)}_m = E(X_m)$
and $\Sigma_{mm'} = \mathrm{cov}(X_m, X_{m'})$), but summed over the $N$
possible assignments of $1 \in \mathbb{N}$ and renormalised accordingly. If the
signal is absent, the distribution of $X$ is again $\mathcal{G}(0, \Sigma)$. The
probability density functions of $X$ in the presence or absence of a GW signal
are then given respectively by (12) and (13) with $x = X$.

We now propose and investigate three general conic compression schemes in
Secs. II A-II C, before comparing their performance and potential applicability
in Sec. II D. The orthogonal and 1-D restrictions (8) and (9) are assumed
throughout Sec. II.

A. Partition scheme 

The simplest method of grouping the template labels $n$ is to take the family of
sets $\mathbb{U}_m$ as a partition of $\mathbb{N}$, i.e.
$\mathbb{U}_m \cap \mathbb{U}_{m'} = \emptyset$ for all distinct
$m, m' \in \mathbb{M}$. Condition (c) is then automatically satisfied, while
condition (b) defines the set cardinality $P = \mathrm{card}(\mathbb{U}_m)$ for
all $m \in \mathbb{M}$. It follows that $M = N/P$.

For the comparison of schemes in Sec. II D, it is useful to introduce a
compression parameter $K \in \mathbb{Z}^+$ for each scheme, which determines the
compression rate

$$\kappa := 1 - \frac{N_{\rm eval}}{N}, \tag{21}$$

where $N_{\rm eval} = M$ is the required number of statistic evaluations (for
detection or localisation purposes). This generates a sliding scale of groupings
that ranges from no compression at $K = 1$ to maximal compression at some
scheme-dependent value of $K$. We may clearly choose $K = P$ for the partition
scheme, such that maximal compression is given by $K = N$. The minimal
nontrivial compression is 50% at $K = 2$, while there are diminishing returns at
large $K$ since $\kappa(K)$ is concave-down.
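
The sliding scale of the partition scheme can be enumerated directly from the divisors of $N$. In the sketch below, grouping consecutive labels into blocks is an arbitrary illustrative choice (it is not an optimised grouping), and the helper names are hypothetical.

```python
def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def partition_sets(n_templates, p):
    """Split labels 1..N into M = N/P consecutive blocks of P labels each.

    Consecutive blocks are an illustrative choice only.
    """
    if n_templates % p != 0:
        raise ValueError("P must divide N")
    return [list(range(start, start + p)) for start in range(1, n_templates + 1, p)]

def compression_rate(n_eval, n_templates):
    """Compression rate (21): kappa = 1 - N_eval / N."""
    return 1.0 - n_eval / n_templates

N = 256
scale = {p: compression_rate(N // p, N) for p in divisors(N)}   # kappa for each K = P
```

For $N = 256$ this yields nine admissible values of $P$ (the powers of two up to 256), with $\kappa = 1 - 1/P$ rising quickly at first and then flattening.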

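From (19) and (20), we now have

$$E(X_m) = A\,\delta_{1m}, \tag{22}$$

$$\mathrm{cov}(X_m, X_{m'}) = P\,\delta_{mm'}, \tag{23}$$

where the sets have been relabelled such that $1 \in \mathbb{U}_1$ without loss
of generality. Again considering the $N$ possible assignments of
$1 \in \mathbb{N}$, the optimal detection statistic
$\Lambda_{\rm opt} := p_1(X)/p_0(X)$ follows from (12) and (13) (with $x = X$)
as

$$\Lambda_{\rm opt} = \frac{1}{M}\exp\left(-\frac{A^2}{2P}\right)\sum_{m\in\mathbb{M}} \exp\left(\frac{A X_m}{P}\right). \tag{24}$$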

Since the conic statistics for the partition scheme remain uncorrelated, the
optimal surfaces $\Lambda_{\rm opt} = \lambda$ resemble that in Fig. 1, and in
lieu of (24) it is valid to consider the maximum-overlap detection statistic

$$X_{\max} := \max_{m\in\mathbb{M}}\{X_m\}. \tag{25}$$

Receiver operating characteristic (ROC) curves of detection rate $P_D$ against
false alarm rate $P_F$ for both the optimal and maximum-overlap statistics are
compared in Fig. 2.^1 With increased compression, the performance of the
maximum-overlap statistic falls away slightly from that of the optimal
statistic, due to the lowering of effective SNR $A/\sqrt{P}$ in (24);
nevertheless, (25) is a sound approximation as both sets of ROC curves show good
overall agreement.

^1 The curves for (24) were obtained via 10^-trial Monte Carlo simulations,
while numerical integration of (16) and (17) was used to generate quicker and
more precise curves for (25).

FIG. 2: ROC curves for the partition scheme's optimal and maximum-overlap
detection statistics, at different values of set cardinality $P$ (with
compression rate $\kappa$ in parentheses) for a 256-template bank and a true SNR
of 10. The dashed diagonal line indicates the worst possible performance, i.e. a
random search for which the detection and false alarm rates are equal.

For the partition scheme to admit a useful (i.e. populated) sliding scale of
compression rates, the template bank might need to be trimmed or padded such
that $N$ has as many divisors as possible. Fixing the false alarm rate and
choosing either a desired detection rate or a compression rate then allows
advance determination of the conic templates (6) and the threshold $X_t$, which
is the value of $X_{\max}$ corresponding to the fixed false alarm rate. The
algorithm for GW detection follows as: (i) evaluate the conic statistics (7);
(ii) claim a detection if $X_{\max} > X_t$. Threshold and detection SNRs for the
maximum-overlap statistic are defined respectively as


$$\rho_t := \frac{X_t}{\sqrt{\mathrm{var}(X_{\max})}}, \tag{26}$$

$$\rho_d := \frac{X_{\max}}{\sqrt{\mathrm{var}(X_{\max})}}. \tag{27}$$


An extension of the detection algorithm is required for identification purposes
(i.e. localisation to a single template), since the simple coarse-graining of
partition compression does not distinguish between template labels in the same
set. The signal is most likely to be associated with the largest conic statistic
evaluation $X_{(1)}$, so the best candidate template may be obtained by further
evaluating all the original statistics $x_n$ contributing to $X_{(1)}$ and
identifying the largest. This finer level of evaluations increases the
computational cost by $P$ to $N_{\rm eval} = M + P$.

For better identification accuracy at lower SNRs, we may widen our search to the
$i$ largest $X_m$ instead, at an added computational cost of $iP$. The standard
algorithms $I_i$ for GW identification follow (after detection) as: (iii)
evaluate the original statistics (5) for all $n \in \mathbb{V}_i$, where

$$\mathbb{V}_i := \bigcup_{j=1}^{i} \mathbb{U}_{(j)}, \tag{28}$$

with $\mathbb{U}_{(j)}$ corresponding to the $j$-th largest conic statistic
evaluation; (iv) identify $\max_{n\in\mathbb{V}_i}\{x_n\}$.
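
Steps (ii)-(iv) of the detection and identification algorithms for the partition scheme can be sketched as follows, given the conic statistics from step (i). The function name, the callable `original_stat` (which evaluates a single $x_n$ on demand) and the noise-free toy data are assumptions made for illustration.

```python
def detect_and_identify(conic_stats, sets, original_stat, threshold, i=1):
    """Steps (ii)-(iv) for the partition scheme, given conic statistics from step (i).

    conic_stats[m] is X_m for the label set sets[m]; original_stat(n) evaluates x_n.
    Returns the identified label, or None if no detection is claimed.
    """
    if max(conic_stats) <= threshold:                       # (ii) maximum-overlap detection
        return None
    ranked = sorted(range(len(sets)), key=lambda m: conic_stats[m], reverse=True)
    candidates = [n for m in ranked[:i] for n in sets[m]]   # V_i, costing i*P evaluations
    return max(candidates, key=original_stat)               # (iii)-(iv) fine-grained search

# Noise-free toy example: N = 12, P = 3, signal in template 8 with amplitude A = 10.
A, true_label = 10.0, 8
sets = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
x = {n: (A if n == true_label else 0.0) for n in range(1, 13)}
X = [sum(x[n] for n in U) for U in sets]    # in practice obtained from the conic templates
found = detect_and_identify(X, sets, x.get, threshold=5.0, i=1)
```

With these inputs the search evaluates $M + P = 7$ statistics rather than $N = 12$, and recovers the injected label.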

Other identification algorithms may also be considered. One such alternative is
obtained by defining a further partition of $\mathbb{V}_i$ into two sets and
evaluating the associated conic statistics, then identifying the set
$\mathbb{V}'$ corresponding to the larger statistic evaluation and repeating the
process with $\mathbb{V}_i = \mathbb{V}'$ until
$\mathrm{card}(\mathbb{V}') = 1$. This method might be useful for large values
of $P$; it yields a smaller added computational cost of $2\log_2(iP)$, but
incurs a penalty to identification accuracy since the early iterations still
involve coarse-grained searches.
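
The bisection alternative can be sketched as below. The callable `conic_stat` is hypothetical: it stands for the evaluation of a conic statistic over an arbitrary subset of labels, whose conic template would have to be prepared offline. The sketch also assumes that the number of candidates is a power of two, so that both halves always contain the same number of templates.

```python
def bisect_identify(candidates, conic_stat):
    """Repeatedly halve the candidate list, keeping the half with the larger conic statistic.

    conic_stat(labels) is a hypothetical callable returning the statistic of the conic
    template summed over `labels`. Assumes len(candidates) is a power of two.
    """
    evaluations = 0
    while len(candidates) > 1:
        half = len(candidates) // 2
        left, right = candidates[:half], candidates[half:]
        evaluations += 2                                    # two conic statistics per step
        candidates = left if conic_stat(left) >= conic_stat(right) else right
    return candidates[0], evaluations

# Noise-free toy example with iP = 8 candidates and the signal in template 6.
x = {n: (10.0 if n == 6 else 0.0) for n in range(1, 9)}
label, cost = bisect_identify(list(range(1, 9)), lambda labels: sum(x[n] for n in labels))
```

Here the cost is $2\log_2 8 = 6$ evaluations, compared with 8 for the direct fine-grained search over the same candidates.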


B. Symmetric base scheme 


Without an additional fine-grained search, partition compression is lossy in the
sense that the GW signal is not automatically identified in the limit of zero
noise. A recently proposed conic compression scheme introduces a lossless method
of compression, by representing each template label $n$ in binary and assigning
it to the set $\mathbb{U}_m$ if its $m$-th digit is 1 [19]. This binary scheme
features the largest possible lossless compression ($M = \log_2 N$) and an
automatic identification of the GW signal; however, it suffers from an unequal
treatment of templates (i.e. it violates conditions (b) and (c)) and hence it
yields an arbitrary level of performance that depends on the initial assignment
of template labels. Furthermore, the restriction to maximum compression limits
its usefulness in practical applications.

We propose a compression scheme modelled on the binary labelling method, but
symmetrised (for equal treatment of templates) and generalised to a sliding
scale of base representations (for tunable compression). The template labels $n$
are represented modulo $N$ in base $B$, and each set
$\mathbb{U}_m = \mathbb{U}_{k,b}$ is constructed by collecting all the labels
whose $k$-th digit is $b$ (this includes $b = 0$, and gives a symmetric version
of the binary scheme [19] when $B = 2$). For conditions (b) and (c) to be
satisfied, we require $\log_B N \in \mathbb{Z}^+$; it follows that
$M = B\log_B N$.
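
The construction of the sets $\mathbb{U}_{k,b}$ can be made concrete with a short sketch. The digit ordering (least significant digit first) and the function names are assumptions made here for illustration; the set cardinalities and pairwise intersections do not depend on that ordering.

```python
def base_digits(value, base, n_digits):
    """Digits of `value` in the given base, least significant first (an assumed ordering)."""
    digits = []
    for _ in range(n_digits):
        value, digit = divmod(value, base)
        digits.append(digit)
    return digits

def symmetric_base_sets(n_templates, base):
    """Return {(k, b): set of labels whose k-th base-B digit (of n mod N) equals b}."""
    n_digits, power = 0, 1
    while power < n_templates:
        power *= base
        n_digits += 1
    if power != n_templates:
        raise ValueError("N must be an integer power of B")
    sets = {(k, b): set() for k in range(1, n_digits + 1) for b in range(base)}
    for n in range(1, n_templates + 1):
        for k, b in enumerate(base_digits(n % n_templates, base, n_digits), start=1):
            sets[(k, b)].add(n)
    return sets

sets = symmetric_base_sets(256, 4)   # K = 4 digits, M = B * K = 16 sets
```

Each of the 16 sets contains $N/B = 64$ labels and each label appears in $K = 4$ sets, so conditions (b) and (c) hold; sets with different $k$ overlap, which is the source of the correlations among the conic statistics.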

The compression parameter is chosen as $K = \log_B N$, such that maximal
compression is given by $K = \log_3 N \approx \ln N$ (base-2 compression is
slightly suboptimal with symmetrisation). In contrast to the partition scheme,
compression for the symmetric base scheme is dependent on the size of the
template bank; the minimal nontrivial compression for N = 10^ is nearly 80% at
$K = 2$ (base-$\sqrt{N}$ compression), and over 95% for N = 10'^.


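From (19) and (20), we have

$$E(X_{k,b}) = A\,\delta_{0b}, \tag{29}$$

$$\mathrm{cov}(X_m, X_{m'}) = B^{K-2}\,(B\,\delta_{kk'}\delta_{bb'} - \delta_{kk'} + 1), \tag{30}$$

where $m = B(k-1) + b + 1$, and the templates have been relabelled such that
$S(t) = A\,h_N(t)$ without loss of generality.^2 Considering the $N$ possible
assignments of $N \in \mathbb{N}$, the optimal detection statistic follows from
(12) and (13) as

$$\Lambda_{\rm opt} = \frac{1}{N}\exp\left(-\frac{\beta K A^2}{2} + (\beta - \alpha)\,A\,\mathrm{tr}(X)\right)\prod_{k=1}^{K}\sum_{b=0}^{B-1}\exp(\alpha A X_{k,b}), \tag{31}$$

$$\alpha := \frac{B}{N}, \qquad \beta := \frac{M-K+1}{NK}, \tag{32}$$

where $\mathrm{tr}(X) := \sum_{m\in\mathbb{M}} X_m$.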

The higher compression rates provided by the symmetric base scheme result from
the non-empty intersections among the sets $\mathbb{U}_{k,b}$ with different
values of $k$. As seen in (30), these also lead to correlations among the conic
statistics $X_{k,b}$. The optimal detection surfaces given by
$\Lambda_{\rm opt} = \lambda$ differ significantly from that depicted in Fig. 1:
their projections onto the correlated subspaces are now compact hyperboloids,
and no longer approach the semi-infinite hypercubes of the maximum-overlap
detection statistic at high SNR (see Fig. 3).

Without a simple approximation for the optimal detection statistic, the most
feasible option is to use (31) itself with an estimate $\epsilon A$ of the true
SNR. ROC curves for the estimated statistic $\Lambda^{(\epsilon)}_{\rm opt}$
with $\epsilon \in \{1/2, 2\}$ are compared against those for the optimal and
maximum-overlap statistics in Fig. 4. Not much detection sensitivity (for a
fixed false alarm rate) is lost if the true SNR can be estimated to within a
factor of two, while usage of the maximum-overlap statistic now incurs a more
noticeable drop in performance, as expected.
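
A sketch of the estimated statistic is given below. It assumes the closed form $\Lambda_{\rm opt} = (1/N)\exp(-\beta K A^2/2 + (\beta-\alpha)A\,\mathrm{tr}(X))\prod_k\sum_b \exp(\alpha A X_{k,b})$ with $\alpha = B/N$ and $\beta = (M-K+1)/(NK)$, and works with its logarithm for numerical stability. The function names and the dictionary layout of the conic statistics are illustrative assumptions; passing $\epsilon A$ in place of $A$ gives the estimated statistic.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def log_lambda_base(X, base, n_digits, amplitude_estimate):
    """log of the symmetric base statistic for conic statistics X[(k, b)].

    k runs over 1..K and b over 0..B-1; amplitude_estimate plays the role of A.
    """
    n_templates = base ** n_digits
    m_sets = base * n_digits
    alpha = base / n_templates
    beta = (m_sets - n_digits + 1) / (n_templates * n_digits)
    a = amplitude_estimate
    trace = sum(X.values())
    log_stat = (-math.log(n_templates) - 0.5 * beta * n_digits * a ** 2
                + (beta - alpha) * a * trace)
    for k in range(1, n_digits + 1):
        log_stat += log_sum_exp([alpha * a * X[(k, b)] for b in range(base)])
    return log_stat
```

As a consistency check, setting $K = 1$ (so that $B = N$ and there is no compression) gives $\alpha = \beta = 1$, and the expression reduces to the optimal statistic for uncorrelated original statistics.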

^2 The covariance matrix defined by (30) is rank-deficient, but we may take the
Moore-Penrose pseudoinverse $\Sigma^+$ as a suitable (perturbative)
approximation to $\Sigma^{-1}$ in (12) and (13).

FIG. 3: Three-dimensional projection of optimal detection surface for correlated
statistics $X_{k,b}$ of symmetric base scheme with $N = 256$ and $B = 4$, at
true SNRs of (a) 10 and (b) 100.

FIG. 4: ROC curves for the symmetric base scheme's optimal, maximum-overlap and
estimated detection statistics, at different values of base $B$ (with
compression rate $\kappa$ in parentheses) for a 256-template bank and a true SNR
of 10.

The restriction of $N$, $B$ and $K$ to integer values also results in more
sparsely populated sliding scales than those admitted by the partition scheme.
There are two possible compression rates for $N = 256$ (base-2 compression is
suboptimal compared to $B = 4$), and three for $N = 6561 = 81^2 = 9^4 = 3^8$;
most other values of $N$ will admit only one or none. Notwithstanding the lack
of tunability, a feasible strategy is to trim or pad the template bank such that
$N$ is a perfect square or cube, since the smallest values of $K$ already yield
high compression rates. The GW detection algorithm then follows as given in
Sec. II A, with some estimated detection statistic in place of $X_{\max}$.
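
The sparsity of the sliding scale can be seen by enumerating every base that represents a given $N$ exactly. The helper below is a hypothetical illustration; it returns the base $B$, the number of digits $K$, the number of conic statistics $M = BK$, and the resulting compression rate.

```python
def base_options(n_templates):
    """List (B, K, M, kappa) for every base B >= 2 with N = B**K exactly."""
    options = []
    for base in range(2, n_templates + 1):
        power, n_digits = 1, 0
        while power < n_templates:
            power *= base
            n_digits += 1
        if power == n_templates:
            m_sets = base * n_digits
            options.append((base, n_digits, m_sets, 1.0 - m_sets / n_templates))
    return options

options_256 = base_options(256)
options_6561 = base_options(6561)
```

For $N = 256$ the admissible bases are 2, 4, 16 and 256, where bases 2 and 4 both give $M = 16$ and base 256 is the uncompressed case; this leaves two distinct nontrivial compression rates. For $N = 6561$ the bases 3, 9 and 81 give three.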

One key feature of the symmetric base scheme and other lossless methods of
compression is automatic identification of the GW signal (upon detection). In
this case, the label of the identified template in base-$B$ representation is
given digit-wise by the largest conic statistic evaluation
$X_{k,(1)} := \max_b\{X_{k,b}\}$ for each value of $k$. However, as each digit
$k$ is identified individually, the overall identification accuracy falls off
severely with increasing $K$ (i.e. the total number of digits).

A possible modification for better accuracy is to consider the $i+1$ largest
$X_{k,b}$ for each $k$ and perform an additional fine-grained search over the
$(i+1)^K$ templates, which increases the computational cost accordingly. The
standard GW identification algorithms $I_i$ follow (after detection) as: (iii)
evaluate the original statistics (5) for all $n \in \mathbb{V}_i$, where

$$\mathbb{V}_i := \bigcap_{k=1}^{K}\bigcup_{j=1}^{i+1} \mathbb{U}_{k,(j)}, \tag{33}$$

with $\mathbb{U}_{k,(j)}$ corresponding to the $j$-th largest conic statistic
evaluation for each $k$; (iv) identify $\max_{n\in\mathbb{V}_i}\{x_n\}$.
Automatic identification is recovered for $i = 0$, where steps (iii) and (iv)
become unnecessary as $\mathrm{card}(\mathbb{V}_0) = 1$.
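
The candidate set (33) can be built digit by digit, as in the following sketch. It assumes that conic statistics are stored in a dictionary keyed by $(k, b)$ and that digits are ordered from least significant; both conventions, and the function names, are illustrative choices.

```python
from itertools import product

def top_digits(X, base, k, count):
    """The `count` digit values b with the largest X[(k, b)] at digit position k."""
    return sorted(range(base), key=lambda b: X[(k, b)], reverse=True)[:count]

def candidate_labels(X, base, n_digits, i=0):
    """Labels in V_i: every combination of the i+1 best digits at each position."""
    n_templates = base ** n_digits
    choices = [top_digits(X, base, k, i + 1) for k in range(1, n_digits + 1)]
    labels = []
    for digits in product(*choices):
        value = sum(d * base ** pos for pos, d in enumerate(digits))   # least significant first
        labels.append(value if value != 0 else n_templates)            # label N is 0 modulo N
    return labels

# Noise-free toy example: N = 16, B = 4, signal in template 7 (base-4 digits 3 then 1).
base, n_digits, true_label, A = 4, 2, 7, 10.0
true_digits = [true_label % base, true_label // base]
X = {(k, b): (A if true_digits[k - 1] == b else 0.0)
     for k in range(1, n_digits + 1) for b in range(base)}
automatic = candidate_labels(X, base, n_digits, i=0)   # automatic identification
wider = candidate_labels(X, base, n_digits, i=1)       # (i+1)^K = 4 candidates
```

With $i = 0$ the single candidate is the injected label, so no original statistics need to be evaluated; with $i = 1$ the fine-grained search runs over four templates.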

For large values of $K$ (small values of $B$), the standard identification
algorithms might still suffer from poor accuracy. One alternative algorithm is
obtained by defining some threshold $X_t$ and considering all conic statistic
evaluations $X_{k,b} > X_t$, then performing the additional fine-grained search
over all the corresponding templates. Such a threshold may be set prior to
data-taking; if $X_{k,(1)} < X_t$ for some value of $k$, the $k$-th digit of the
label is unconstrained and templates corresponding to all possible choices of
that digit are considered. Alternatively, $X_t$ may be based on the data by
setting $X_t = f\min_k\{X_{k,(1)}\}$ for some fixed fraction $f$, which ensures
that at least one possible value is identified for each digit. Both approaches
will in general yield increased accuracy, but they offer less control over the
number of conic statistic evaluations considered and hence the overall
computational cost.






























C. Binomial coefficient scheme 


The symmetric base labelling method is not the only construction of the sets
$\mathbb{U}_m$ that preserves both lossless compression (automatic
identification) and equal treatment of templates (conditions (b) and (c)). In
general, we may represent any assignment of $N$ templates to $M$ sets with a
collection of $N$ $M$-digit binary labels, where the $m$-th digit of each label
is 1 if it appears in $\mathbb{U}_m$ and 0 otherwise. Condition (c) implies that
each label must appear in exactly $R$ sets, and hence contain exactly $R$ 1's.
In addition, condition (b) defines the set cardinality
$C = \mathrm{card}(\mathbb{U}_m)$ for all $m \in \mathbb{M}$, which yields the
constraint $NR = MC$ (each of the $N$ labels appears exactly $R$ times across
all sets, while each of the $M$ sets contains exactly $C$ labels). For some
given integers $N > M > R$, this constraint is equivalent to the existence of

$$C = \frac{NR}{M} \in \mathbb{Z}^+, \tag{34}$$

which is both a necessary and sufficient condition for such a set construction
to be possible.

We now require that the conic statistics (7) are correlated symmetrically, as
seen in the partition scheme (but not the symmetric base scheme). This
additional condition implies that the intersection of each pair of sets has
fixed cardinality $I$, i.e.
$\mathrm{card}(\mathbb{U}_m \cap \mathbb{U}_{m'}) = I$ for all distinct
$m, m' \in \mathbb{M}$. Considering the family of all such intersections then
yields the constraint $NR(R-1) = M(M-1)I$ (each of the $N$ labels appears
exactly ${}^{R}C_2$ times across all intersections, while each of the
${}^{M}C_2$ intersections contains exactly $I$ labels). For some given integers
$N > M > R$ and $C$ satisfying (34), this constraint is equivalent to the
existence of

$$I = \frac{NR(R-1)}{M(M-1)} = \frac{C(R-1)}{M-1} \in \mathbb{Z}^+, \tag{35}$$

which is a necessary (but not in general sufficient) condition for such a set
construction to be possible.

The general construction of a family of sets under the
constraints (34) and (35) is an open problem in combinatorial
design theory (see App. A). In this paper, we
restrict our focus to a special case that may be treated
in greater detail. Every M-digit binary number with exactly
R 1's is taken to represent a distinct template label;
the set cardinality then equals the number of (M − 1)-digit
binary numbers with exactly (R − 1) 1's, while the
intersection cardinality of each pair of sets equals the
number of (M − 2)-digit binary numbers with exactly
(R − 2) 1's. Hence for all distinct m, m' ∈ M, we have

N = {}^{M}C_{R}, \quad C = {}^{M-1}C_{R-1}, \quad I = {}^{M-2}C_{R-2}, (36)

such that (34) and (35) are satisfied. We refer to this as
the binomial coefficient scheme, for obvious reasons. The
usual ordering of the binary numbers gives a natural map
onto the original label collection \mathcal{N} = {n ∈ Z+ | n ≤ N},
although the inverse map is analytically nontrivial (but
straightforward in practice).
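
The counting identities in (36) can be checked by brute force for a small bank. The following is a minimal sketch rather than the implementation used for the results in this paper: the helper name `binomial_scheme_sets` and the values M = 6, R = 3 are illustrative assumptions, and labels are simply numbered by the usual ordering of the binary numbers.

```python
from itertools import combinations
from math import comb

def binomial_scheme_sets(M, R):
    """Build the sets U_1..U_M of the binomial coefficient scheme.

    Each template label is an M-digit binary number with exactly R ones;
    U_m collects the labels whose m-th digit is 1. (Hypothetical helper.)
    """
    # One binary code per choice of R digit positions, sorted in the usual order.
    codes = sorted(sum(1 << pos for pos in positions)
                   for positions in combinations(range(M), R))
    sets = [set() for _ in range(M)]
    for n, code in enumerate(codes, start=1):  # labels n = 1, ..., N
        for m in range(M):
            if (code >> m) & 1:
                sets[m].add(n)
    return codes, sets

M, R = 6, 3                      # illustrative values only
codes, sets = binomial_scheme_sets(M, R)
N = len(codes)                   # number of template labels
C = len(sets[0])                 # set cardinality
I = len(sets[0] & sets[1])       # pairwise intersection cardinality
print(N, C, I)                   # compare with comb(M, R), comb(M-1, R-1), comb(M-2, R-2)
```

The same check can be repeated for any small (M, R) pair to confirm that both constraints NR = MC and NR(R − 1) = M(M − 1)I hold.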

As the binomial coefficient scheme shares many similarities
with the symmetric base scheme, we only highlight
its key features in this section. The compression
parameter is chosen as K = R, such that maximal
compression is given by K = cbc^{-1}(N)/2 (where
cbc(M) := Γ(M + 1)/Γ(M/2 + 1)^2 is the continuous extension
of the central binomial coefficient {}^{M}C_{M/2}). Compression
rates again depend on the size of the template
bank; at small values of K, they are only slightly higher
than those of the symmetric base scheme.

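From the general expressions for the mean and covariance of the
conic statistics, we have

E(X_m) = A \sum_{r=1}^{R} \delta_{rm}, (37)

cov(X_m, X_{m'}) = {}^{M-2}C_{R-2} \left( \frac{M-R}{R-1} \delta_{mm'} + 1 \right), (38)

where the sets have been relabelled such that 1 ∈ U_r for
1 ≤ r ≤ R without loss of generality. Considering the
N possible assignments of 1 ∈ \mathcal{N}, the optimal detection
statistic follows from (37) and (38) as

\Lambda_{opt} = \frac{1}{N} \exp\left( -\frac{\beta R A^2}{2} + (\beta - \alpha) A \, \mathrm{tr}(X) \right) \times \sum_{i=1}^{N} \exp\left( \alpha A \sum_{r \in R_i} X_r \right), (39)

\alpha = \frac{1}{{}^{M-2}C_{R-1}}, \quad \beta = \frac{1}{{}^{M-1}C_{R-1}}, (40)

where the sets R_i are the N distinct R-combinations of
the collection M.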
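
As a numerical cross-check of the optimal statistic for this scheme, the sketch below evaluates the likelihood ratio by brute force. It rests on assumptions that are labelled in the code: orthonormal templates (so that the conic covariance is the cardinality of the set intersections), zero-mean Gaussian noise, and a uniform prior over the N possible signal labels. The function names are hypothetical. The closed-form version uses the constants α = 1/C(M − 2, R − 1) and β = 1/C(M − 1, R − 1), which follow from inverting the covariance under those assumptions, and treats tr(X) as the sum of all conic statistics.

```python
import numpy as np
from itertools import combinations
from math import comb

def conic_covariance(M, R):
    """cov(X_m, X_m') = card(U_m ∩ U_m'), assuming orthonormal templates."""
    C, I = comb(M - 1, R - 1), comb(M - 2, R - 2)
    return I * np.ones((M, M)) + (C - I) * np.eye(M)

def lambda_brute_force(X, A, M, R):
    """Average the Gaussian likelihood ratio over all N = comb(M, R) signal labels."""
    sigma_inv = np.linalg.inv(conic_covariance(M, R))
    total = 0.0
    for Ri in combinations(range(M), R):   # indices of the sets containing the signal label
        mu = np.zeros(M)
        mu[list(Ri)] = A                   # E(X_m) = A for m in Ri, and 0 otherwise
        total += np.exp(mu @ sigma_inv @ X - 0.5 * mu @ sigma_inv @ mu)
    return total / comb(M, R)              # uniform prior over labels (assumption)

def lambda_closed_form(X, A, M, R):
    """Same quantity, written with the structured inverse covariance."""
    alpha = 1.0 / comb(M - 2, R - 1)
    beta = 1.0 / comb(M - 1, R - 1)
    prefactor = np.exp(-0.5 * beta * R * A**2 + (beta - alpha) * A * X.sum())
    terms = [np.exp(alpha * A * X[list(Ri)].sum())
             for Ri in combinations(range(M), R)]
    return prefactor * sum(terms) / comb(M, R)

rng = np.random.default_rng(0)
X = rng.normal(size=6)                     # made-up conic statistic evaluations
print(lambda_brute_force(X, 2.0, 6, 3), lambda_closed_form(X, 2.0, 6, 3))
```

The brute-force sum has N terms, so this form is only practical for small banks; it is intended as a consistency check rather than a search implementation.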

All the conic statistics X_m are correlated symmetrically,
as seen in (38). Upon projection onto any three-dimensional
subspace, the optimal detection surfaces
given by Λ_opt = const. resemble those in Fig. 3 at both low
and high SNR. It follows that the maximum-overlap detection
statistic is again an inadequate approximation to
the optimal statistic, and we are compelled to use (39)
itself (assuming an accurate estimate of true SNR). We
do not include ROC curves for the binomial coefficient 
scheme here, as they are very similar to those in Fig. 01 

A direct comparison of the base and binomial schemes
is difficult, since there are few suitable values of N that
are exactly valid for both schemes. Lack of tunability is
also more of an issue for the binomial scheme: the only
values of N that admit more than one nontrivial compression
rate might be the Singmaster numbers (which
admit two as they appear six times in Pascal’s triangle),
and it is not known whether any number admits more
than two (apart from N = 3003 = {}^{78}C_{2} = {}^{15}C_{5} = {}^{14}C_{6})
[ 2 ^ . The problem may be overcome by considering 
a more general compression scheme satisfying the conditions
(34) and (35). This is beyond the scope of the
current paper due to the complexity of set construction
(see App. A), but might be investigated for specific template
banks in the future.
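
To see how restrictive the binomial scheme is for a given bank size, one can enumerate the pairs (M, R) for which the binomial coefficient equals N. The sketch below is illustrative only: `binomial_representations` is a hypothetical helper, and it counts only 2 ≤ R ≤ M/2, i.e. it treats R = 1 as trivial and identifies R with M − R (an assumption about what counts as a distinct nontrivial setting).

```python
from math import comb

def binomial_representations(N):
    """Return all (M, R) with comb(M, R) == N and 2 <= R <= M // 2."""
    reps = []
    R = 2
    # comb(2R, R) is the smallest coefficient with lower index R in this range.
    while comb(2 * R, R) <= N:
        M = 2 * R
        while comb(M, R) < N:   # comb(M, R) increases with M for fixed R
            M += 1
        if comb(M, R) == N:
            reps.append((M, R))
        R += 1
    return reps

for N in (210, 256, 3003):
    print(N, binomial_representations(N))
```

A bank size with an empty list admits no nontrivial binomial setting at all, which is the lack of tunability referred to above.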

The GW detection algorithm for the binomial coefficient
scheme is as given in Sec. II A, with some estimated
detection statistic in place of X_max. Automatic
identification is available as well, with the label of the
identified template given uniquely by the R largest conic
statistic evaluations. For higher accurate-identification
rates, a possible alternative is to consider the R + i largest
X_m and perform an additional fine-grained search over
the templates. The standard GW identification
algorithms I_i follow (after detection) as: (iii) evaluate the
original statistics for all n ∈ V_i, where

V_i = \bigcup_{k=1}^{{}^{R+i}C_{R}} \bigcap_{j \in J_k} U_{(j)}, (41)

with U_{(j)} corresponding to the j-th largest conic statistic
evaluation and the sets J_k given by the {}^{R+i}C_{R} distinct
R-combinations of {j ∈ Z+ | j ≤ R + i}; (iv) identify
max_{n ∈ V_i}{x_n}. Automatic identification is recovered for
i = 0, where steps (iii) and (iv) become unnecessary as
card(V_0) = 1.
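
Steps (iii) and (iv) can be sketched in a few lines. This is a hedged illustration, not the code used for the results in this paper: `build_sets` and `candidate_labels` are hypothetical helpers, labels are numbered by the order in which `itertools.combinations` generates them (rather than by the binary ordering), and the statistic values are made up.

```python
from itertools import combinations

def build_sets(M, R):
    """Binomial-scheme sets: label n corresponds to the n-th R-combination of set indices."""
    sets = [set() for _ in range(M)]
    for n, combo in enumerate(combinations(range(M), R), start=1):
        for m in combo:
            sets[m].add(n)
    return sets

def candidate_labels(X, sets, R, i):
    """Labels to re-examine in the fine-grained search, using the R + i largest X_m."""
    ranked = sorted(range(len(X)), key=lambda m: X[m], reverse=True)[:R + i]
    V = set()
    for J in combinations(ranked, R):      # every R-combination of the top-ranked sets
        V |= set.intersection(*(sets[m] for m in J))
    return V

sets = build_sets(5, 2)
X = [0.1, 3.0, 0.2, 2.5, 1.0]              # made-up conic statistic evaluations
print(candidate_labels(X, sets, R=2, i=0)) # a single label: automatic identification
print(candidate_labels(X, sets, R=2, i=1)) # three candidate labels
```

The original statistics would then be evaluated only for the returned labels, and the largest of them taken as the identified template.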

We note that another possible scheme would be a “direct
sum” of the partition scheme and either the symmetric
base or binomial coefficient scheme. The collection
of template labels is first partitioned into subcollections,
each of which is further decomposed into smaller sets
via one of the correlated schemes; these sets may also be
recombined across the initial partition for increased compression.
We do not consider this further here, but such
an approach would overcome some of the difficulties associated
with the restricted values of N for the base and
binomial schemes.


D. Performance comparison 

In this section, we compare the performance of the uncorrelated
partition scheme and its two correlated alternatives
across three areas: template bank compression,
GW detection and GW identification (i.e. localisation
to a single template). The detection and identification
plots here (and throughout the rest of the paper) were 
obtained using 10^-trial Monte Carlo simulations, and so 
the errors on each plot point are ^ 10“^ for a one-sigma 
binomial confidence interval. 

Log-log plots of M against N for various conic compression
schemes are shown in Fig. 5, where the maximum
lossless compression provided by the binary scheme
has also been included for reference. As alluded
to in Secs. II A–II C, the partition scheme has the largest
range of compression rates, both in terms of compression
bounds (plot area) and admitted rates (discrete density,
not shown). The two correlated schemes cover similar
areas at lower densities in Fig. 5, with the binomial coefficient
scheme offering slightly greater compression.

Detection performance for each compression setting of
a given scheme may be measured by detection sensitivity
at a fixed false alarm rate (which is simply read off the
corresponding ROC curve), or by a summary statistic
that captures most of the information contained in an
ROC curve (e.g. the area A_ROC under the curve). Since
an ROC curve always lies above the no-discrimination
line P_D = P_F, we define the discrimination

D := 2A_{ROC} − 1, (42)

which serves as a measure of how well the detection statistic
discriminates between true and false positives.
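
The discrimination (42) can be estimated directly from Monte Carlo samples of the detection statistic. The sketch below is an assumption-laden illustration: it uses the standard identity that the area under an ROC curve equals the probability that a signal-present statistic exceeds a noise-only one (with ties counted as one half), and the function name and the toy max-statistic simulation are hypothetical.

```python
import numpy as np

def discrimination(noise_stats, signal_stats):
    """Estimate D = 2 * A_ROC - 1 from samples of the detection statistic."""
    noise = np.asarray(noise_stats, dtype=float)
    signal = np.asarray(signal_stats, dtype=float)
    # A_ROC = P(signal statistic > noise statistic) + 0.5 * P(tie).
    greater = (signal[:, None] > noise[None, :]).mean()
    ties = (signal[:, None] == noise[None, :]).mean()
    return 2.0 * (greater + 0.5 * ties) - 1.0

# Toy usage: maximum over M unit-variance statistics, one of which has mean rho.
rng = np.random.default_rng(1)
M, rho, trials = 16, 3.0, 2000
noise_only = rng.normal(size=(trials, M)).max(axis=1)
with_signal = rng.normal(size=(trials, M))
with_signal[:, 0] += rho
print(discrimination(noise_only, with_signal.max(axis=1)))
```

The pairwise comparison uses memory quadratic in the number of trials, so a rank-based estimate would be preferable for very large simulations.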

Fig. 6(a) shows plots of discrimination against compression
for the three proposed schemes at different values
of true SNR, with N ≈ 256. We use the maximum-overlap
detection statistic in lieu of the optimal statistic
for the partition scheme, and are compelled to choose
N = 210 for the binomial coefficient scheme. The three
schemes have comparable performance at lower SNRs,
but the partition scheme begins to outpace its correlated
alternatives as SNR increases.

To compare identification performance (after a true detection),
we consider plots of accurate-identification rate
P_I against compression, but only for the fastest standard
algorithms of each scheme (i.e. I_1 for the partition
scheme, and automatic identification I_0 for the correlated
schemes). The rate P_I for each plot point is calculated
using all and only the trials with the injected signal
present, and therefore assumes perfect detection throughout
(P_D = 1 and P_F = 0). This decouples identification
from detection: it allows standardised comparison of the
schemes at a fixed false alarm rate, and does not penalise
the identification performance of any method for having
inferior detection performance.

As seen in Fig. 6(b), the usefulness of lossless compression
and automatic identification is limited in the presence
of noise; the addition of a simple fine-grained search
to the partition scheme is enough to yield significantly
higher identification accuracy even at marginally lower
compression. The turnaround in accurate-identification
rates for the partition scheme at larger values of P is due
to the additional statistic evaluations used in the fine-grained
search, which for I_1 gives N_eval = M + P in (21).
Since M = N/P, κ(P) has one turning point. For this
example, P = 8 and P = 64 provide the same level of
compression; identification accuracy is higher for the former
at ρ = 10, similar for both at ρ = 4, and higher for
the latter at ρ = 2.
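
The single turning point can be made concrete with a short calculation. The sketch below assumes that the compression rate is defined as κ = 1 − N_eval/N with N_eval = M + P and M = N/P; this definition is an assumption on our part, chosen because it reproduces the rates quoted for the N = 16384 bank of Sec. IV, and the function names are hypothetical.

```python
def n_eval(N, P):
    """Statistic evaluations for detection plus fine-grained search: M + P, with M = N / P."""
    if N % P:
        raise ValueError("this sketch assumes that P divides N")
    return N // P + P

def compression_rate(N, P):
    """kappa = 1 - N_eval / N (assumed definition of the compression rate)."""
    return 1.0 - n_eval(N, P) / N

N = 16384  # bank size of the Taylor-T2 example
for P in (4, 16, 64, 256, 1024, 4096):
    print(P, n_eval(N, P), round(100 * compression_rate(N, P), 1))
```

Because N/P + P is unchanged when P is exchanged with N/P, set cardinalities on either side of P = √N come in pairs with equal compression, and the maximum rate is reached at P = √N.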

In summary, the partition scheme offers better overall
performance than its correlated alternatives at the same
level of compression. For GW detection, the introduced
correlations among the conic statistics lead to slightly reduced
detection sensitivity and discriminatory power at
high SNR; furthermore, the potential benefits of lossless
compression for GW identification turn out to be nullified
by the effects of noise. Hence there appears to be little
reason for using correlated schemes over the partition
scheme, which is more promising as it is easy to implement
and admits a relatively populated sliding scale of
compression rates. We further investigate and implement
the partition scheme as the representative conic compression
scheme in Secs. III and IV.

FIG. 5: Log-log plots of M against N for various compression schemes. For each tunable scheme, the corresponding shaded
region indicates the range of possible compression rates (with the trivial compression setting K = 1 excluded). Not every point
in this region is realisable in practice, as discussed in the text.

FIG. 6: Plots of (a) discrimination D and (b) accurate-identification rate P_I against compression rate κ for the partition,
symmetric base and binomial coefficient schemes, at different values of true SNR ρ for a ~256-template bank.

III. ORTHOGONALITY AND SUBSPACES 

The conic compression schemes proposed in Sec. II are
fully general, in the sense that no prior assumptions
about the template bank are made apart from those of
orthogonal templates and a signal confined to a 1-D subspace.
These orthogonal and 1-D restrictions are neither
realistic nor optimal, as template banks typically feature
highly correlated neighbouring templates and are
unlikely to contain a template exactly proportional to
the GW signal itself. In this section, we discuss the (separate)
lifting of each assumption for the partition scheme,
and the resultant effects on detection sensitivity and localisation
accuracy. Each approach may be viewed as
a simplified limiting case of an actual template bank,
which can always be made dense enough to include a
signal-proportional template (assuming model accuracy),
or orthogonalised. A more realistic example with both
assumptions lifted is considered in Sec. IV.

A. Non-orthogonal templates 

We first consider a sufficiently dense bank of correlated
(non-orthogonal) templates, such that the GW signal still
lies in the 1-D subspace spanned by a single template in
Hilbert space. From the first equalities in the general
expressions for the means and covariances of the original and
conic statistics, it follows in the presence of a GW signal that

E(X_m) = A \sum_{n \in U_m} (h_1|h_n), (43)

cov(X_m, X_{m'}) = \sum_{n \in U_m} \sum_{n' \in U_{m'}} (h_n|h_{n'}). (44)

Any partition of N as in Sec. II A defines a splitting of
the (sorted) original mean vector and covariance matrix
into P × 1 blocks and P × P blocks respectively; each
entry in the conic mean vector and covariance matrix is
then simply the sum of entries in the corresponding block,
which reflects the coarse-graining of the compression.
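
The block-sum structure is easy to express with array reshaping. The following is a minimal sketch under the assumption that the templates are already sorted so that each set occupies P consecutive labels; `conic_moments` is a hypothetical helper name.

```python
import numpy as np

def conic_moments(mean, cov, P):
    """Sum a length-N mean vector and an N x N covariance matrix over blocks of P labels."""
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    N = mean.size
    if N % P:
        raise ValueError("this sketch assumes that P divides N")
    M = N // P
    conic_mean = mean.reshape(M, P).sum(axis=1)              # P x 1 blocks
    conic_cov = cov.reshape(M, P, M, P).sum(axis=(1, 3))     # P x P blocks
    return conic_mean, conic_cov

# Orthonormal example: signal proportional to the first template, with A = 3.
mean = np.zeros(8)
mean[0] = 3.0
conic_mean, conic_cov = conic_moments(mean, np.eye(8), P=2)
print(conic_mean)
print(conic_cov)
```

For orthonormal templates this reduces to a conic covariance of P times the identity, while for correlated templates the off-diagonal blocks contribute nonzero conic covariances.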

As a toy model for investigating non-orthogonal templates,
we use a frequency-parametrised bank of sinusoidal
waveforms h = sin(2πft) with finite observation
time T. Assuming white noise for simplicity, the inner
product may be written as

(a|b) \propto \int_0^T a(t) b(t) \, dt. (45)

For an N-template bank with f_min ≤ f ≤ f_max and δf :=
(f_max − f_min)/(N − 1) ≪ f_min, the overlaps are given by

(h_n|h_{n+\Delta n}) \approx 2 \int_0^1 \sin(2\pi f_{min} t) \sin(2\pi (f_{min} + |\Delta n| \delta f) t) \, dt, (46)

where we have normalised to T = 1 such that f is given
in waveform cycles per observation time. This sinc-like
function of Δn ∈ Z yields a band covariance matrix for
x_n; we set N = 256, and choose the frequency bounds
such that cov(x_n, x_{n+1}) ≈ 0.97 (i.e. a maximal
mismatch of around 0.03).
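
The overlap integral for unit-time sinusoids has a simple closed form, which the sketch below evaluates. The values of `f_min` and `delta_f` are hypothetical, chosen only so that neighbouring templates overlap at roughly 0.97, and the matrix uses the actual pair of frequencies for each entry rather than the approximation that depends on the label difference alone.

```python
import numpy as np

def sinusoid_overlap(f1, f2):
    """2 * integral over [0, 1] of sin(2 pi f1 t) sin(2 pi f2 t) dt, in closed form."""
    # sin(a) sin(b) = [cos(a - b) - cos(a + b)] / 2, and np.sinc(x) = sin(pi x) / (pi x).
    return np.sinc(2.0 * (f1 - f2)) - np.sinc(2.0 * (f1 + f2))

def overlap_matrix(f_min, delta_f, N):
    """Overlap matrix of N sinusoidal templates with frequencies f_min + n * delta_f."""
    f = f_min + delta_f * np.arange(N)
    return sinusoid_overlap(f[:, None], f[None, :])

# Hypothetical values, in waveform cycles per observation time.
overlaps = overlap_matrix(f_min=100.0, delta_f=0.0675, N=256)
print(overlaps[0, 0], overlaps[0, 1], overlaps[0, 10])
```

The first term dominates when the frequencies are large compared with their spacing, which is why the overlaps are approximately a sinc function of the label difference.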

In contrast to the orthogonal case, the choice of partition
generally affects the performance of the partition
scheme for non-orthogonal templates. For the one-parameter
template bank with overlaps given by (46),
we consider both a randomised partition and a more
optimised (but not necessarily optimal) partition with
U_m = {n ∈ Z+ | (m − 1)P < n ≤ mP}. We also include
for comparison a uniformly spaced M-template subset of
the original bank (equivalently, U_m = {n ∈ Z+ | n =
mP}, where ∪_{m ∈ M} U_m ≠ N). This “coarsened” template
bank is not compressed; it is obtained in a more
straightforward way by simply reducing the correlation
(increasing the maximal mismatch) between neighbouring
templates. The standard detection algorithm outlined
in Sec. II A is then applied for the two partition
schemes and the coarsening method.

Fig. 7(a) shows plots of discrimination (using X_max)
against compression for both choices of partition and the
coarsened template bank, where performance in the presence
of a GW signal is averaged over the N possible locations
of the corresponding template in the bank. The
optimised partition (with highly correlated templates
grouped together) outperforms its randomised counterpart
at all considered values of true SNR. It also shows
significant improvement over the coarsening method at
higher compression rates, which is expected as it uses
information from the full N-template bank rather than
just an M-template subset.

FIG. 7: Plots of (a) discrimination D against compression rate κ for the randomised/optimised partition scheme and the
coarsening method, and (b) accurate-identification/localisation rate P_I against compression rate κ for the optimised partition
scheme and the coarsening method, at different values of true SNR ρ for a non-orthogonal 256-template bank. Accurate
localisation here is defined as the identification of the template h_1 or one of the nearest P = 1/(1 − κ) templates.

The largest statistic evaluation for the coarsened template
bank identifies a best guess for the GW signal, but
the accuracy of this identification is zero if the signal
does not correspond to a template in the coarsened bank.
Since the spacing of the coarsened bank is P, we may consider
the best-guess template as representative of the P
templates nearest to it (or P − 1 if P is odd), and say
that the largest statistic evaluation localises a best guess
for the signal. We then define the localisation to be accurate
if the correct template h_1 is one of those templates
(equivalently, if the identified best-guess template is h_1 or
one of the P templates nearest to h_1). The identification
algorithms in Sec. II A also identify a single best-guess
template for the partition scheme, which allows us to
consider both accurate identification (to a precision of 1)
and accurate localisation (to a precision of P) in the same
way. Fig. 7(b) shows plots of accurate-identification and
localisation rates (using I_1, which gives N_eval = M + P
in (21)) against compression for the (optimised) partition
scheme and the coarsening method.

As in Sec. II D, the turnaround in accurate-identification
and localisation rates for the partition
scheme is due to the additional statistic evaluations of
the fine-grained search. The localisation rates increase
up to some level of compression, which is mainly because
“accurate” localisation is defined up to a degree of
precision that degrades with compression; this effect is
seen for the coarsening method as well. Localisation to
within the spacing of the original template bank (i.e.
identification) decreases monotonically in accuracy for
the partition scheme, and will not be achievable for the
majority of signals with the coarsening method. The
partition scheme localises the GW signal with slightly
greater accuracy than the coarsening method, and in
fact identifies it with virtually no fall-off in accuracy at
significant compression levels.

Increasing the correlation between neighbouring templates
is known to improve the detection and localisation
performance of a general template bank. Results
in this section illustrate that the partition scheme
retains these benefits up to high levels of compression,
and provides a superior alternative to simply coarsening
the template bank for computational savings. The
viability of conic compression becomes even more evident
in Sec. IV, where we apply the partition scheme to
a larger and more broadly correlated two-parameter template
bank.

B. 2-D subspace 

Throughout Secs. II and III A, we have assumed that
the GW signal is exactly proportional to a template
in the bank. To understand the impact on compression
performance when this is not the case, we consider
a bank of N uncorrelated templates obtained through
some orthogonalisation procedure
on a general template bank, and a signal lying in the
N-dimensional Hilbert space spanned by the orthogonal
set. If N is large, the signal is typically restricted to a
low-dimensional subspace (this follows from the volume
of an N-sphere). For simplicity, we assume it lies exactly
between two templates in a 2-D subspace, i.e.

s(t) = A (h_1(t) + h_2(t)), (47)

where the templates have been relabelled without loss of
generality and A = \rho/\sqrt{2} from (4). Hence the expectation
values of the original and conic statistics become

E(x_n) = A (\delta_{1n} + \delta_{2n}), (48)

E(X_m) = A \, card({1, 2} ∩ U_m), (49)

while their covariances remain unchanged.
The assumption (47) is the worst-case scenario for
a 2-D subspace, since the signal is maximally far from
both templates in the subspace.

FIG. 8: Plots of (a) discrimination D and (b) accurate-identification rate P_I against compression rate κ for the partition scheme
with 1-D and 2-D signals, at different values of true SNR ρ for a 256-template bank. The higher dotted curves for each value
of ρ correspond to the template labels 1 and 2 being assigned to the same set, while the lower curves correspond to them being
assigned to different sets. Accurate identification for the 2-D case is defined as the identification of both templates h_1 and h_2.

Although it is not possible to pre-optimise the choice
of partition for orthogonal templates, the performance of
the partition scheme in the presence of a 2-D GW signal
(47) falls into two partition-dependent cases. At small
values of P, it is more likely that the labels 1 ∈ U_m and
2 ∈ U_m' are assigned to different sets (m ≠ m'); as P
increases, so does the probability that they are assigned
to the same set (m = m'), which improves performance
(e.g. the effective SNR for detection purposes is raised
by a factor of √2).
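
The two partition-dependent cases can be illustrated by computing the conic expectation values directly. This is a sketch under the stated toy assumptions (orthonormal templates, a signal given by an equal-weight sum of templates, and total SNR ρ); `conic_signal_means` is a hypothetical helper and the eight-template bank is made up.

```python
import numpy as np

def conic_signal_means(rho, sets, signal_labels):
    """E(X_m) for an equal-weight sum of orthonormal templates with total SNR rho."""
    A = rho / np.sqrt(len(signal_labels))      # A = rho / sqrt(2) for a 2-D signal
    return np.array([A * len(U & signal_labels) for U in sets])

rho = 4.0
same = [{1, 2, 3, 4}, {5, 6, 7, 8}]            # labels 1 and 2 share a set
split = [{1, 3, 5, 7}, {2, 4, 6, 8}]           # labels 1 and 2 in different sets
print(conic_signal_means(rho, same, {1, 2}))   # largest mean is sqrt(2) * rho
print(conic_signal_means(rho, split, {1, 2}))  # both means are rho / sqrt(2)
print(conic_signal_means(rho, same, {1}))      # 1-D signal: largest mean is rho
```

Since the variance of each conic statistic is P in all three cases, the ratio of the largest means gives the change in effective SNR relative to a 1-D signal.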


The standard detection algorithm in Sec. II A is applicable
for a 2-D signal, while the standard identification
algorithms may be generalised at step (iv) by considering
the two largest original statistic evaluations instead.
Fig. 8 shows plots of discrimination and accurate-identification
rate against compression for a 2-D signal
s ∝ h_1 + h_2, compared against a 1-D signal s ∝ h_1 with
the same true SNR ρ. The identification algorithm I_2 is
used, since the accuracy rate of I_1 falls to zero if m ≠ m'.
This gives N_eval = M + 2P in (21).

For detection of a 2-D GW signal, the effectiveness of
the partition scheme is reduced slightly at lower SNRs,
but mitigated by the case where m = m' (i.e. the higher
dotted curves in Fig. 8). Detection performance for this
special case actually improves up to some level of compression,
which is possible as the symmetry among all
possible signals is broken (by the partitioning process).
A similar effect is seen for the example in Sec. IV. The
discrimination for a 1-D signal generally lies within the
2-D discrimination bounds; at higher compression rates,
there is little to no detection performance lost if the signal
is not confined to a 1-D subspace.

Accurate identification of a 2-D GW signal (i.e. the
identification of both h_1 and h_2, in this toy model) is
more problematic than in the 1-D case, since accuracy
rates are reduced to begin with and fall off rapidly even
at high SNR. Nevertheless, options such as lowering compression
or switching to I_{i>2} are available for the partition
scheme, which should at least allow the template
with maximal signal overlap to be identified at acceptable
accuracy rates.

If the true SNR is sufficiently high, the standard algorithms
I_{i≥j} may also be used to identify a j-dimensional
GW signal described by an arbitrary linear combination
of templates, i.e.

s(t) = \sum_{k=1}^{j} A_k h_k(t), (50)

where A_k ≥ A_{k+1} and the templates have been relabelled
without loss of generality. At step (iv) of the algorithms,
each ordered weight A_k may be approximated by the k-th
largest original statistic evaluation x_{(k)}, with the SNR
of the identified signal given by

\rho_I = \left( \sum_{k=1}^{j} x_{(k)}^2 \right)^{1/2}. (51)

While this method fully recovers the (relative) weights
of a GW signal’s j largest modes in the limit of infinite
true SNR, its accuracy might be limited for lower-SNR
signals and/or large values of P.


IV. EXAMPLE: TAYLOR-T2 TEMPLATE BANK 

In this section, we implement the (optimised) partition
scheme described in Secs. II A and III A for a larger and
more realistic example: a two-parameter template bank
of mixed-order PN waveforms, which describe the gravitational
radiation emitted during the inspiral part of a
comparable-mass binary merger. An optimised partition
in this case (and in general) refers to a partition of the
template bank such that highly correlated templates are
grouped together as much as possible.

The waveform family we use is the Taylor-T2 approximant
[13] for a circular and non-inclined binary
with comparable component masses m_1 ≥ m_2. These
waveforms are parametrised by their chirp mass \mathcal{M} =
(m_1 m_2)^{3/5}/(m_1 + m_2)^{1/5} and symmetric mass ratio η =
m_1 m_2/(m_1 + m_2)^2, and are written as PN expansions in
a frequency-related variable x defined in terms of \dot{\phi},
where \dot{\phi} is the time derivative of the orbital phase \phi. We
truncate the PN expansions at finite order, specifying
the phase, amplitude and mass monopole to 3.5PN, 2PN
and 1PN respectively; the resultant mixed-order waveform
may be written compactly as [32]

hM^t) = (52) 

where R is the source distance (to which the true SNR ρ
is inversely proportional), and expressions for the amplitude
function A and tail-distorted orbital phase are
given in App. B.
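
The mass parametrisation used for the template bank can be converted to and from component masses with the standard definitions of chirp mass and symmetric mass ratio. The sketch below is illustrative only; the function names are hypothetical, and masses are in arbitrary but consistent units.

```python
def chirp_mass_and_eta(m1, m2):
    """Chirp mass (m1 m2)^(3/5) / (m1 + m2)^(1/5) and symmetric mass ratio m1 m2 / (m1 + m2)^2."""
    total = m1 + m2
    eta = m1 * m2 / total**2
    chirp = (m1 * m2) ** 0.6 / total ** 0.2
    return chirp, eta

def component_masses(chirp, eta):
    """Invert the map above, returning (m1, m2) with m1 >= m2."""
    total = chirp * eta ** -0.6                      # total mass = chirp mass * eta^(-3/5)
    root = max(0.0, 1.0 - 4.0 * eta) ** 0.5          # guard against round-off at eta = 1/4
    return 0.5 * total * (1.0 + root), 0.5 * total * (1.0 - root)

chirp, eta = chirp_mass_and_eta(3.0, 1.0)
print(chirp, eta)
print(component_masses(chirp, eta))
```

The symmetric mass ratio is bounded above by 1/4 (equal masses), so a grid in these parameters should stay below that value.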

Template bank compression is potentially more important
for analysing data from the low-frequency eLISA
detector, since the long duration of sources in the eLISA
band results in a much larger number of templates required
to cover parameter space [23]. As mergers of
massive black-hole binaries are an anticipated source for
eLISA [3], we consider a Taylor-T2 GW signal with the
parameters θ_c = (1, 0.15), where θ := (\mathcal{M}/(10^6 M_⊙), η);
this corresponds to a binary black-hole inspiral with component
masses (m_1, m_2) = (1.9\mathcal{M}, 0.7\mathcal{M}). The duration
of the signal is set to t_c = 1 yr.

FIG. 9: Matrix/contour plots of the expectation values E(x_n)
and E(X_m) for the partition scheme, at different values of set
cardinality P (with compression rate κ in parentheses) for a
Taylor-T2 GW signal (red cross) injected between the four
central templates of a (128 × 128)-template Taylor-T2 bank.
The signal has chirp mass \mathcal{M} = 10^6 M_⊙ and symmetric mass
ratio η = 0.15, while the bank is gridded uniformly in linearly
transformed parameters \mathcal{M}' (increasing from top to bottom)
and η' (increasing from left to right) with maximal mismatch
~0.01. Overlap values depend on the true SNR ρ (set to 1
in these plots), and range from positive (orange) to negative
(blue) in some subinterval of (−Pρ, Pρ).

FIG. 10: Plots of (a) detection rate P_D (at fixed false alarm rate P_F) and (b) accurate-localisation rate P_I (to nearest ν
templates) against compression rate κ for the optimised partition scheme and the coarsening method, at different values of
true SNR ρ for a GW signal injected with central parameters θ_c. Accurate localisation here is defined as the identification of
a template within the central squares of ν templates.

We also generate a bank of Taylor-T2 templates with
the same duration, each normalised with respect to the
inner product, where the noise power spectral density
is given by a (two-sided) analytic approximation to that
of eLISA. These templates are gridded uniformly
in the transformed parameters θ' := θ_c + L(θ − θ_c)
(128 points in each parameter), with the signal lying in
the middle of the four central templates and the linear
transformation L chosen such that the template overlaps
are isotropic with respect to the grid (at least for the
central region). The maximal mismatch of each template
with its four nearest neighbours is around 0.01.

Since the N = 16384 templates are pre-sorted by the
(skewed) square grid, an optimised (but not necessarily
optimal) partition is obtained by the obvious grouping
into M blocks of √P × √P templates. This particular
template bank admits six nontrivial square partitions
with P ∈ {4, 16, 64, 256, 1024, 4096}; we do not consider
the case P = 4096, as P = 1024 already yields a compression
rate of 99.9%. A large number of rectangular
partitions (where P = 2^i with 0 < i < 14) are also possible,
but we omit these here for simplicity as they are
degenerate with the square partitions and among themselves.
Square partitions are straightforward to generalise
for various lattice choices [34–36], and will be fairly
optimal as long as the templates are gridded uniformly
in the parameter-space metric.
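
For templates stored on a square grid, the square-block grouping amounts to a reshape followed by a sum. The sketch below is a minimal illustration with a hypothetical helper name; it sums one scalar per template (e.g. an expectation value or a statistic evaluation), and by linearity the same grouping applies to the templates themselves.

```python
import numpy as np

def square_block_sums(grid, P):
    """Sum a G x G array of per-template quantities over square blocks of P templates."""
    grid = np.asarray(grid, dtype=float)
    G = grid.shape[0]
    side = int(round(P ** 0.5))                # block side length sqrt(P)
    if side * side != P or grid.shape != (G, G) or G % side:
        raise ValueError("P must be a perfect square whose root divides the grid size")
    blocks = G // side
    return grid.reshape(blocks, side, blocks, side).sum(axis=(1, 3))

small = square_block_sums(np.arange(16.0).reshape(4, 4), P=4)
print(small)

conic = square_block_sums(np.ones((128, 128)), P=16)
print(conic.shape, conic.size)                 # M = N / P conic entries
```

Rectangular partitions would follow from the same reshape with different block side lengths along the two axes.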

The expectation values of the original and conic statistics
are visualised in Fig. 9, where the coarse-graining of the
compression is evident. Overlaps for the Taylor-T2 template
bank are much less localised than the toy model overlaps
in Sec. III A; this is due to their wider cycle widths in
both \mathcal{M} and η, as well as a slight degeneracy in the two
parameters (overlaps at the boundary of the first plot in
Fig. 9 can be as high as 0.4ρ). As the templates are so
broadly correlated and the GW signal is injected right in
the centre of the bank, the partition scheme is expected
to perform well up to a high level of compression.

For comparison purposes, we again consider the simple
coarsening method discussed in Sec. III A. The smaller
coarsened banks are formed by selecting individual templates
near the centre of each square block in the original
bank, rather than by summing the templates in each
block (as in the partition scheme). Detection and localisation
performance for both the partition scheme and the
coarsening method on the Taylor-T2 template bank with
central injection is summarised in Fig. 10. The semi-log
plots in this section use an abscissa of −lg(1 − κ)/3, as
most of the considered compression rates are > 90%.

Instead of the discrimination (|^ . we quantify detec¬ 
tion performance using the detection rates at two fixed 
false alarm rates Pp = 10“^ and Pp = 10“"^ (the num¬ 
ber of Monte Carlo trials performed for each plot point 
is ^ 10®, and so the errors are ^ 10“® for a one-sigma 
binomial confidence interval). At all considered values 
of SNR and fixed false alarm rate, there is no fall-off
in the partition scheme’s detection performance up to
κ = 93.8% (and even a slight increase, due to the special
choice of central injection). While this is also the case
for the coarsening method, detection rates for the partition
scheme are distinctly higher at compression rates of
> 90%, with improvements of over 0.1 at κ = 99.9%.

FIG. 11: Matrix/contour plots of the expectation values
E(x_n) for Taylor-T2 GW signals (red crosses) injected in a
(128 × 128)-template Taylor-T2 bank at random (left) and
near the boundary (right).

The identification algorithm I_1 is used to localise the
GW signal, which gives N_eval = M + P in (21). Rates
for accurate localisation to within two central squares of
12 × 12 templates (corresponding to < 1% of the entire
bank) and 4 × 4 templates (< 0.1% of the bank) are
considered. Localisation is typically improved by compression
up to κ = 93.7%, which is provided by two different
values of P (see discussion in Sec. II D). The two
values are P = 16, beyond which the matrix/contour
plot of E(X_m) in Fig. 9 loses scale-similarity to that of
E(x_n), and P = 1024, for which performance is regained
as each conic template incorporates more of the original
templates and accuracy is added by the fine-grained
search. Localisation is poorer at κ = 98.0%, which corresponds
to both P = 64 and P = 256. To reduce clutter
in Fig. 10(b), only the higher localisation rates for
κ = 93.7% and κ = 98.0% are plotted. The partition
scheme outperforms the coarsening method at most levels
of compression, especially in the case of accurate localisation
to within the smaller square of 4 × 4 templates.

For the special case of a centrally injected GW signal,
the detection and localisation performance of the
partition scheme is non-decreasing up to high levels of
compression and can even rise above that of the original
template bank; however, this may also be said for the
coarsening method. To illustrate that the improvement
of the partition scheme over the coarsening method is not
simply due to the special choice of injection, we consider
two other cases: a Taylor-T2 signal injected with randomly
drawn parameters θ_r = (1.0, 0.16), and another
injected near the boundary of the bank with the parameters
θ_b = (0.98, 0.06) (i.e. in the middle of the four corner
templates with low chirp mass and symmetric mass
ratio). The expectation values of the original statistics
for these two injections are visualised in Fig. 11.


Fig. 12 shows detection and accurate-localisation rates
for both the partition scheme and the coarsening method
on the random and boundary injections. The random
injection is actually recovered with slightly better rates
than the central injection rates in Fig. 10, but with a similar
improvement of the partition scheme over the coarsening
method. A more marked difference between the
two methods is obtained for the boundary injection. Detection
rates for both methods are now non-increasing,
with the partition scheme showing greater improvement
over the coarsening method; for the ρ = 6 case, the improvement
is around 0.3 at κ = 93.8%. Rates for accurate
localisation of the boundary injection to within the corner
square of 12 × 12 templates follow a similar trend,
with a largest improvement of around 0.5 (again for the
ρ = 6 case at κ = 93.7%).

Detection and localisation performance for this Taylor-T2
example is injection-dependent, as it is for any realistic
template bank: there is clearly no symmetry among
all possible GW signals, since the templates are asymmetrically
correlated and the signals may lie between templates.
We have not undertaken a full injection-averaged
analysis (similar to that performed in Sec. III A) due to
the size of the template bank, but overall detection rates
for such an analysis should decrease monotonically with
compression as per intuition, with the partition scheme
outperforming the coarsening method (as it does for the
three injections presented here, as well as several others
we have examined).

The partition scheme is expected to remain robust for
searches in a $(d > 2)$-dimensional parameter space. As
the number of templates that are highly correlated with
the GW signal increases exponentially with $d$, enlarging
the span $P$ of each conic template at the same rate should
maintain detection and localisation performance while
increasing the relative computational savings (which scale
as $1 - 1/P$). Good scaling with parameter-space
dimensionality allows conic compression to be competitive with
other search techniques that reduce computational cost.
For example, the method of searching over time offset
(i.e. signal time-of-arrival) using fast Fourier transforms
yields a logarithmic reduction in the number of search
points for that parameter [33], but for multidimensional
searches an overall logarithmic reduction is easily
attained by the partition scheme with little impact on
performance. The two methods might even be combined for
greater savings, by constructing conic sums of templates
aligned at a fixed reference time and using Fourier
transforms of the conic templates to search over time offset.
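
The scaling of the savings with the span $P$ can be illustrated with a minimal sketch. It assumes, purely for illustration, that the partition groups neighbouring grid templates into hypercubic blocks of side `span` (so that $P = \mathrm{span}^d$), and that a plain Euclidean inner product stands in for the noise-weighted one. The function names are hypothetical and are not taken from the paper.

```python
from itertools import product
from math import prod

def block_partition(shape, span):
    """Partition a d-dimensional grid of template indices into hypercubic
    blocks of side `span` (assumed grouping of neighbouring templates).
    Each block holds P = span**d indices when `span` divides every axis."""
    blocks = {}
    for index in product(*(range(n) for n in shape)):
        key = tuple(i // span for i in index)
        blocks.setdefault(key, []).append(index)
    return list(blocks.values())

def conic_sum(bank, block):
    """Unit-coefficient sum of the templates in one block (a conic template)."""
    length = len(bank[block[0]])
    return [sum(bank[index][t] for index in block) for t in range(length)]

def inner(a, b):
    """Euclidean inner product, a stand-in for the noise-weighted overlap."""
    return sum(x * y for x, y in zip(a, b))

def relative_savings(n_templates, n_conic):
    """Fraction of online inner products avoided; 1 - 1/P for equal blocks."""
    return 1.0 - n_conic / n_templates

for shape in [(8, 8), (4, 4, 4)]:
    blocks = block_partition(shape, span=2)
    # prints d, P and the savings: "2 4 0.75" then "3 8 0.875"
    print(len(shape), len(blocks[0]), relative_savings(prod(shape), len(blocks)))
```

Because the inner product is linear, the overlap of the data with a conic sum equals the sum of the overlaps with its member templates, so one online evaluation per block replaces $P$ evaluations.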


V. CONCLUSION 

In this paper, we have presented and compared three
tunable conic compression schemes (partition, symmetric
base and binomial coefficient) for a general template
bank in a grid-based GW search. The bank is compressed
in the preparatory offline stage, which yields faster
detection and localisation of signals by reducing the number
of inner product evaluations performed online.

[Figure with panels (a) and (b), both for $N = 128 \times 128$.
Caption as recovered (opening words lost): ... $12 \times 12$
templates) against compression rate $\kappa$ for the optimised
partition scheme and the coarsening method, at different values
of true SNR $\rho$ for a GW signal injected with random
parameters $\theta_r$ and boundary parameters $\theta_b$.
Accurate localisation here is defined as the identification of
a template within the square of $12 \times 12$ templates nearest
to each injection.]

A recently proposed binary labelling method [l^ , modified
to ensure the equal treatment of templates, is contained
as a particular case of the symmetric base scheme.
Optimal detection statistics have been calculated for all
three schemes under simplified conditions, and the standard
maximum-overlap detection statistic (i.e. the maximum
overlap over all the compressed templates) is shown
to be significantly suboptimal for the base and binomial
schemes. While these two lossless schemes provide automatic
identification of the GW signal upon detection, the
benefits of this are negated in the presence of noise;
furthermore, the lossy partition scheme offers better
detection and identification performance than its counterparts
at the same level of compression.

We have applied the partition scheme to toy models of
(i) a correlated template bank with a signal-proportional
template and (ii) a signal lying in the span of orthogonal
templates, to show that it remains feasible under such
conditions. These toy models are instructive as they
represent the two limiting cases of a general template
bank. Correlations among the original templates result
in partition-dependent performance, but this may be
optimised beforehand by grouping highly correlated
templates together; the optimised partition scheme is then
superior to a simple coarsening of the template bank. If
the signal is proportional to a linear combination of
templates in an orthogonal bank, the detection performance
of the scheme is not significantly reduced.


Conic compression performs well if the original template
bank is sufficiently correlated, as demonstrated by
our example implementation of the optimised partition
scheme for a bank of $\sim 10^4$ PN waveforms. We consider
a centrally injected GW signal, a randomly injected one,
and one at the boundary of the bank; again, the scheme
is superior to the coarsening method across the board.
The partition scheme is shown to be viable for practical
applications, as it maintains good detection sensitivity
and localisation accuracy up to high levels of compression
and at all considered values of SNR for this more
realistic template bank.


In summary, our tunable conic compression schemes
(specifically the optimised partition scheme) might
provide an effective method of improving the speed,
detection sensitivity and localisation accuracy of GW
template banks. The schemes are potentially useful for any
search involving template banks, as they are fully general
and may easily be adapted to supplement existing
algorithms in GW data analysis pipelines. Conic
compression is also particularly promising in the context of
eLISA data analysis, where online grid searches are
difficult as computational costs are more prohibitive; for
example, the method could be used as an online tool to
rapidly identify nearby sources before merger and generate
alerts for electromagnetic telescopes.

Appendix A: Combinatorial design theory

The problem of constructing a family of sets $U_m$ under
the cardinality constraints (34) and (35) in Sec. III C may
be regarded geometrically as the problem of constructing
a collection of $N$ distinct points (representing template
labels) and $M$ distinct lines (representing sets) with the
following properties:

(i) each point lies on exactly $R$ lines;

(ii) each line passes through exactly $C$ points;

(iii) any two lines intersect at exactly $I$ points;

(iv) any two points lie together on at most $R - 1$ lines.

The final property is the automatic identification condition,
i.e. no two labels are assigned to exactly the same
subfamily of sets.

The feasibility of carrying out such a construction (or
finding additional conditions on $N$, $M$ and $R$ that
ensure it is possible) is a difficult and unsolved problem
in combinatorics. One special case that has been studied
in detail is $R = C$ and $I = 1$. This implies that
$N = M = R^2 - R + 1$, and that any two points must lie
on exactly one line. Under these circumstances, the four
geometrical properties define a finite projective plane of
order $R - 1$ [25]. It is known that finite projective planes
exist with prime orders [ 2 ^, but there is no finite
projective plane of order 6 [s^ or 10 [s^, while the existence (or
otherwise) of an order-12 finite projective plane remains
an open question.
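
The relation $N = M = R^2 - R + 1$ can be recovered by a short double-counting argument. The sketch below assumes only that properties (i)-(iii) hold exactly; it is not a restatement of the numbered constraints of Sec. III C, which may be phrased differently.

```latex
\begin{align}
  % count (point, line) incidences in two ways
  N R &= M C, \\
  % count (point, unordered pair of lines through that point) in two ways
  N \binom{R}{2} &= I \binom{M}{2}.
\end{align}
Setting $R = C$ in the first relation gives $N = M$; setting $I = 1$ in the
second then gives
\begin{equation}
  R(R - 1) = N - 1
  \quad\Longrightarrow\quad
  N = M = R^2 - R + 1 .
\end{equation}
```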

The special case of finite projective planes is
uninteresting from a compression-scheme point of view, as it has
$N = M$ and hence achieves no compression. However, it
strongly indicates that the conditions (34) and (35) are
not sufficient to ensure the existence of a set construction
with the four required properties. Nonetheless, valid set
constructions have been found for small values of $N$, $M$
and $R$; for example, $(N, M, R) = (10, 6, 3)$ yields $C = 5$,
$I = 2$, and the set construction

$$
\begin{aligned}
U_1 &= \{1, 2, 3, 4, 5\}, \\
U_2 &= \{1, 2, 6, 7, 8\}, \\
U_3 &= \{1, 3, 6, 9, 10\}, \\
U_4 &= \{2, 5, 8, 9, 10\}, \\
U_5 &= \{3, 4, 7, 8, 10\}, \\
U_6 &= \{4, 5, 6, 7, 9\}.
\end{aligned}
\tag{A1}
$$
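
The four geometrical properties can be checked mechanically for the construction (A1). The following is a minimal sketch; the function name `check_design` is hypothetical, and property (iv) is tested in its equivalent form that no two labels belong to exactly the same subfamily of sets.

```python
from itertools import combinations

# Set construction (A1) for (N, M, R) = (10, 6, 3), copied from the text.
SETS = [
    {1, 2, 3, 4, 5},
    {1, 2, 6, 7, 8},
    {1, 3, 6, 9, 10},
    {2, 5, 8, 9, 10},
    {3, 4, 7, 8, 10},
    {4, 5, 6, 7, 9},
]

def check_design(sets, n_labels):
    """Return (R, C, I) if properties (i)-(iv) hold, else raise ValueError."""
    labels = range(1, n_labels + 1)
    # Subfamily of sets (by index) to which each label belongs.
    membership = {
        x: frozenset(m for m, s in enumerate(sets) if x in s) for x in labels
    }
    r_values = {len(family) for family in membership.values()}   # (i)
    c_values = {len(s) for s in sets}                            # (ii)
    i_values = {len(a & b) for a, b in combinations(sets, 2)}    # (iii)
    if not (len(r_values) == len(c_values) == len(i_values) == 1):
        raise ValueError("properties (i)-(iii) are not all satisfied")
    # (iv): every label must have a distinct subfamily of sets.
    if len(set(membership.values())) != n_labels:
        raise ValueError("two labels share the same subfamily of sets")
    return r_values.pop(), c_values.pop(), i_values.pop()

print(check_design(SETS, 10))  # (3, 5, 2)
```

The returned triple agrees with the values $R = 3$, $C = 5$ and $I = 2$ quoted for this example.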

Additional solutions for $(N, M, R) = (12, 9, 3)$ and
$(N, M, R) = (14, 7, 3)$ also exist. No counterexamples
(i.e. values of $(N, M, R)$ satisfying (34) and (35) but
admitting no set construction) have been found for $N > M$,
although we have not conducted an exhaustive search.

A general compression scheme satisfying the conditions
(34) and (35) might potentially admit more compression
rates than the symmetric base scheme for each value of
$N$. Given the difficulties in actually constructing the sets,
however, we focus instead on the special case of "maximal
representation" for fixed $M$ and $R$ (i.e. every $M$-digit
binary number with exactly $R$ 1's represents a distinct
template label); this gives the binomial coefficient scheme
described in Sec. III C.
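
The "maximal representation" case can be enumerated directly. The sketch below assumes that the $m$-th set collects the labels whose $m$-th binary digit is 1; this correspondence, and the function names, are illustrative assumptions rather than the definitions of Sec. III C.

```python
from itertools import combinations
from math import comb

def binomial_labels(m, r):
    """All m-digit binary strings containing exactly r ones."""
    labels = []
    for ones in combinations(range(m), r):
        digits = ["0"] * m
        for pos in ones:
            digits[pos] = "1"
        labels.append("".join(digits))
    return labels

def sets_from_labels(labels, m):
    """k-th set = indices (from 1) of the labels whose k-th digit is 1
    (assumed correspondence between binary digits and sets)."""
    return [
        {n for n, label in enumerate(labels, start=1) if label[k] == "1"}
        for k in range(m)
    ]

M, R = 5, 2
labels = binomial_labels(M, R)
sets = sets_from_labels(labels, M)
N = len(labels)

# N = C(M, R) labels are represented by only M sets.
print(N, comb(M, R), [len(s) for s in sets])  # 10 10 [4, 4, 4, 4, 4]
```

In this small example, ten template labels are encoded by five sets of four labels each, and every label is identified uniquely by the pair of sets that contain it.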


Appendix B: Taylor-T2 PN expansions 

The Taylor-T2 PN waveform (52) used in Sec. IV describes
the inspiral part of a circular and non-inclined
comparable-mass binary merger [sol - fs^ . Its amplitude 
and phase are written as expansions in the frequency- 
related variable 



with the orbital phase $\phi$ given to 3.5PN accuracy by

where $\gamma$ is the Euler-Mascheroni constant. Here $\tau$ is
a time-related variable, written in terms of the binary
coalescence time $t_c$ as

while the GW phase is twice the tail-distorted orbital
phase


„3 8/5 

where we set G = lyr for a massive 10 ®Mq) black- 
hole binary inspiral. 

The GW amplitude is then proportional to the 2PN 
amplitude function 

$$
H = x \left[ 2 + \frac{1}{3}(-13 + \eta)\,x + 4\pi x^{3/2}
+ \frac{1}{180}\left(-837 - 635\eta + 15\eta^2\right) x^2 \right],
\tag{B4}
$$


= (B5) 


with the 1PN factor of $1 - (\eta/2)x$ included to account for
the nonlinear interaction between the gravitational field 
of the source and its emitted gravitational radiation [s^ . 
The constant frequency in $x(0)$ is set to $\dot{\phi}(0) = 10^{-4}\pi$,
which corresponds to an approximate entry frequency of
$10^{-4}$ Hz for the eLISA detector [s^.
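
As a brief consistency check on these two values, and assuming that $\dot{\phi}$ denotes the orbital angular frequency in rad/s (the units are not stated explicitly here), the dominant GW harmonic at twice the orbital frequency gives:

```latex
\begin{equation}
  f_{\mathrm{GW}}(0) = 2 f_{\mathrm{orb}}(0)
  = 2 \, \frac{\dot{\phi}(0)}{2\pi}
  = \frac{10^{-4}\pi}{\pi}\,\mathrm{Hz}
  = 10^{-4}\,\mathrm{Hz}.
\end{equation}
```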
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